NOVEL PROTEINS AND NUCLEIC ACIDS ENCODING SAME



FIELD OF THE INVENTION 

The invention generally relates to nucleic acids and polypeptides encoded therefrom. 

BACKGROUND OF THE INVENTION

The invention generally relates to nucleic acids and polypeptides encoded therefrom.
More specifically, the invention relates to nucleic acids encoding cytoplasmic, nuclear,
membrane bound, and secreted polypeptides, as well as vectors, host cells, antibodies, and
recombinant methods for producing these nucleic acids and polypeptides.

SUMMARY OF THE INVENTION 

The invention is based in part upon the discovery of nucleic acid sequences encoding
novel polypeptides. The novel nucleic acids and polypeptides are referred to herein as NOVX,
or NOV1, NOV2, NOV3, NOV4, NOV5, NOV6, NOV7, NOV8, NOV9, and NOV10 nucleic
acids and polypeptides. These nucleic acids and polypeptides, as well as derivatives,
homologs, analogs and fragments thereof, will hereinafter be collectively designated as
"NOVX" nucleic acid or polypeptide sequences.

In one aspect, the invention provides an isolated NOVX nucleic acid molecule
encoding a NOVX polypeptide that includes a nucleic acid sequence that has identity to the
nucleic acids disclosed in SEQ ID NOS:1, 3, 5, 7, 9, 11, 13, 15, 17, 19, 21, 23, and 25. In
some embodiments, the NOVX nucleic acid molecule will hybridize under stringent
conditions to a nucleic acid sequence complementary to a nucleic acid molecule that includes
a protein-coding sequence of a NOVX nucleic acid sequence. The invention also includes an
isolated nucleic acid that encodes a NOVX polypeptide, or a fragment, homolog, analog or
derivative thereof. For example, the nucleic acid can encode a polypeptide at least 80%
identical to a polypeptide comprising the amino acid sequences of SEQ ID NOS:2, 4, 6, 8, 10,
12, 14, 16, 18, 20, 22, 24, and 26. The nucleic acid can be, for example, a genomic DNA
fragment or a cDNA molecule that includes the nucleic acid sequence of any of SEQ ID
NOS:1, 3, 5, 7, 9, 11, 13, 15, 17, 19, 21, 23, and 25.

Also included in the invention is an oligonucleotide, e.g., an oligonucleotide which
includes at least 6 contiguous nucleotides of a NOVX nucleic acid (e.g., SEQ ID NOS:1, 3, 5,
7, 9, 11, 13, 15, 17, 19, 21, 23, and 25) or a complement of said oligonucleotide.

Also included in the invention are substantially purified NOVX polypeptides (SEQ ID
NOS:2, 4, 6, 8, 10, 12, 14, 16, 18, 20, 22, 24, and 26). In certain embodiments, the NOVX
polypeptides include an amino acid sequence that is substantially identical to the amino acid
sequence of a human NOVX polypeptide.

The invention also features antibodies that immunoselectively bind to NOVX
polypeptides, or fragments, homologs, analogs or derivatives thereof.

In another aspect, the invention includes pharmaceutical compositions that include
therapeutically- or prophylactically-effective amounts of a therapeutic and a
pharmaceutically-acceptable carrier. The therapeutic can be, e.g., a NOVX nucleic acid, a NOVX polypeptide,
or an antibody specific for a NOVX polypeptide. In a further aspect, the invention includes, in
one or more containers, a therapeutically- or prophylactically-effective amount of this
pharmaceutical composition.

In a further aspect, the invention includes a method of producing a polypeptide by 
culturing a cell that includes a NOVX nucleic acid, under conditions allowing for expression 
of the NOVX polypeptide encoded by the DNA. If desired, the NOVX polypeptide can then 
be recovered. 

In another aspect, the invention includes a method of detecting the presence of a
NOVX polypeptide in a sample. In the method, a sample is contacted with a compound that
selectively binds to the polypeptide under conditions allowing for formation of a complex
between the polypeptide and the compound. The complex is detected, if present, thereby
identifying the NOVX polypeptide within the sample.

The invention also includes methods to identify specific cell or tissue types based on
their expression of a NOVX.

Also included in the invention is a method of detecting the presence of a NOVX 
nucleic acid molecule in a sample by contacting the sample with a NOVX nucleic acid probe 
or primer, and detecting whether the nucleic acid probe or primer bound to a NOVX nucleic
acid molecule in the sample.

In a further aspect, the invention provides a method for modulating the activity of a 
NOVX polypeptide by contacting a cell sample that includes the NOVX polypeptide with a 
compound that binds to the NOVX polypeptide in an amount sufficient to modulate the 
activity of said polypeptide. The compound can be, e.g., a small molecule, such as a nucleic
acid, peptide, polypeptide, peptidomimetic, carbohydrate, lipid or other organic (carbon
containing) or inorganic molecules as further described herein.

Also within the scope of the invention is the use of a therapeutic in the manufacture of
a medicament for treating or preventing disorders or syndromes including, e.g., diabetes,
metabolic disturbances associated with obesity, the metabolic syndrome X, anorexia, wasting
disorders associated with chronic diseases, metabolic disorders, diabetes, obesity, infectious
disease, anorexia, cancer-associated cachexia, cancer, neurodegenerative disorders,
Alzheimer's Disease, Parkinson's Disorder, immune disorders, and hematopoietic disorders,
or other disorders related to cell signal processing and metabolic pathway modulation. The
therapeutic can be, e.g., a NOVX nucleic acid, a NOVX polypeptide, or a NOVX-specific
antibody, or biologically-active derivatives or fragments thereof.

For example, the compositions of the present invention will have efficacy for treatment
of patients suffering from: developmental diseases, MHCII and III diseases (immune
diseases), taste and scent detectability disorders, Burkitt's lymphoma, corticoneurogenic
disease, signal transduction pathway disorders, retinal diseases including those involving
photoreception, cell growth rate disorders; cell shape disorders, feeding disorders; control of
feeding; potential obesity due to over-eating; potential disorders due to starvation (lack of
appetite), noninsulin-dependent diabetes mellitus (NIDDM1), bacterial, fungal, protozoal and
viral infections (particularly infections caused by HIV-1 or HIV-2), pain, cancer (including but
not limited to neoplasm; adenocarcinoma; lymphoma; prostate cancer; uterus cancer),
anorexia, bulimia, asthma, Parkinson's disease, acute heart failure, hypotension, hypertension,
urinary retention, osteoporosis, Crohn's disease; multiple sclerosis; Albright Hereditary
Osteodystrophy, angina pectoris, myocardial infarction, ulcers, asthma, allergies, benign
prostatic hypertrophy, and psychotic and neurological disorders, including anxiety,
schizophrenia, manic depression, delirium, dementia, severe mental retardation,
dentatorubro-pallidoluysian atrophy (DRPLA), hypophosphatemic rickets, autosomal
dominant (2) acrocallosal syndrome and dyskinesias, such as Huntington's disease or Gilles
de la Tourette syndrome and/or other pathologies and disorders of the like.

The polypeptides can be used as immunogens to produce antibodies specific for the
invention, and as vaccines. They can also be used to screen for potential agonist and
antagonist compounds. For example, a cDNA encoding NOVX may be useful in gene
therapy, and NOVX may be useful when administered to a subject in need thereof. By way of
non-limiting example, the compositions of the present invention will have efficacy for
treatment of patients suffering from bacterial, fungal, protozoal and viral infections
(particularly infections caused by HIV-1 or HIV-2), pain, cancer (including but not limited to
neoplasm; adenocarcinoma; lymphoma; prostate cancer; uterus cancer), anorexia, bulimia,
asthma, Parkinson's disease, acute heart failure, hypotension, hypertension, urinary retention,
osteoporosis, Crohn's disease; multiple sclerosis; and treatment of Albright Hereditary
Osteodystrophy, angina pectoris, myocardial infarction, ulcers, asthma, allergies, benign
prostatic hypertrophy, and psychotic and neurological disorders, including anxiety,
schizophrenia, manic depression, delirium, dementia, severe mental retardation and
dyskinesias, such as Huntington's disease or Gilles de la Tourette syndrome and/or other
pathologies and disorders.

The invention further includes a method for screening for a modulator of disorders or
syndromes including, e.g., diabetes, metabolic disturbances associated with obesity, the
metabolic syndrome X, anorexia, wasting disorders associated with chronic diseases,
metabolic disorders, diabetes, obesity, infectious disease, anorexia, cancer-associated
cachexia, cancer, neurodegenerative disorders, Alzheimer's Disease, Parkinson's Disorder,
immune disorders, and hematopoietic disorders or other disorders related to cell signal
processing and metabolic pathway modulation. The method includes contacting a test
compound with a NOVX polypeptide and determining if the test compound binds to said
NOVX polypeptide. Binding of the test compound to the NOVX polypeptide indicates the test
compound is a modulator of activity, or of latency or predisposition to the aforementioned
disorders or syndromes.

Also within the scope of the invention is a method for screening for a modulator of
activity, or of latency or predisposition to disorders or syndromes including, e.g., diabetes,
metabolic disturbances associated with obesity, the metabolic syndrome X, anorexia, wasting
disorders associated with chronic diseases, metabolic disorders, diabetes, obesity, infectious
disease, anorexia, cancer-associated cachexia, cancer, neurodegenerative disorders,
Alzheimer's Disease, Parkinson's Disorder, immune disorders, and hematopoietic disorders or
other disorders related to cell signal processing and metabolic pathway modulation by
administering a test compound to a test animal at increased risk for the aforementioned
disorders or syndromes. The test animal expresses a recombinant polypeptide encoded by a
NOVX nucleic acid. Expression or activity of NOVX polypeptide is then measured in the test
animal, as is expression or activity of the protein in a control animal which
recombinantly-expresses NOVX polypeptide and is not at increased risk for the disorder or syndrome. Next,
the expression of NOVX polypeptide in both the test animal and the control animal is
compared. A change in the activity of NOVX polypeptide in the test animal relative to the
control animal indicates the test compound is a modulator of latency of the disorder or
syndrome.

In yet another aspect, the invention includes a method for determining the presence of
or predisposition to a disease associated with altered levels of a NOVX polypeptide, a NOVX
nucleic acid, or both, in a subject (e.g., a human subject). The method includes measuring the
amount of the NOVX polypeptide in a test sample from the subject and comparing the amount
of the polypeptide in the test sample to the amount of the NOVX polypeptide present in a
control sample. An alteration in the level of the NOVX polypeptide in the test sample as
compared to the control sample indicates the presence of or predisposition to a disease in the
subject. Preferably, the predisposition includes, e.g., diabetes, metabolic disturbances
associated with obesity, the metabolic syndrome X, anorexia, wasting disorders associated
with chronic diseases, metabolic disorders, diabetes, obesity, infectious disease, anorexia,
cancer-associated cachexia, cancer, neurodegenerative disorders, Alzheimer's Disease,
Parkinson's Disorder, immune disorders, and hematopoietic disorders. Also, the expression
levels of the new polypeptides of the invention can be used in a method to screen for various
cancers as well as to determine the stage of cancers.

In a further aspect, the invention includes a method of treating or preventing a
pathological condition associated with a disorder in a mammal by administering
a NOVX polypeptide, a NOVX nucleic acid, or a NOVX-specific antibody to a subject (e.g., a
human subject), in an amount sufficient to alleviate or prevent the pathological condition. In
preferred embodiments, the disorder includes, e.g., diabetes, metabolic disturbances
associated with obesity, the metabolic syndrome X, anorexia, wasting disorders associated
with chronic diseases, metabolic disorders, diabetes, obesity, infectious disease, anorexia,
cancer-associated cachexia, cancer, neurodegenerative disorders, Alzheimer's Disease,
Parkinson's Disorder, immune disorders, and hematopoietic disorders.

In yet another aspect, the invention can be used in a method to identify the cellular
receptors and downstream effectors of the invention by any one of a number of techniques
commonly employed in the art. These include but are not limited to the two-hybrid system,
affinity purification, co-precipitation with antibodies or other specific-interacting molecules.

Unless otherwise defined, all technical and scientific terms used herein have the same
meaning as commonly understood by one of ordinary skill in the art to which this invention
belongs. Although methods and materials similar or equivalent to those described herein can
be used in the practice or testing of the present invention, suitable methods and materials are
described below. All publications, patent applications, patents, and other references
mentioned herein are incorporated by reference in their entirety. In the case of conflict, the
present specification, including definitions, will control. In addition, the materials, methods,
and examples are illustrative only and not intended to be limiting.

Other features and advantages of the invention will be apparent from the following
detailed description and claims.

DETAILED DESCRIPTION OF THE INVENTION 

The present invention provides novel nucleotides and polypeptides encoded thereby.
Included in the invention are the novel nucleic acid sequences and their polypeptides. The
sequences are collectively referred to as "NOVX nucleic acids" or "NOVX polynucleotides"
and the corresponding encoded polypeptides are referred to as "NOVX polypeptides" or
"NOVX proteins." Unless indicated otherwise, "NOVX" is meant to refer to any of the novel
sequences disclosed herein. Table A provides a summary of the NOVX nucleic acids and
their encoded polypeptides. Example 1 provides a description of how the novel nucleic acids
were identified.



TABLE A. Sequences and Corresponding SEQ ID Numbers

NOVX        Internal Identification     SEQ ID NO       SEQ ID NO      Homology
Assignment                              (nucleic acid)  (polypeptide)
1           AP001404_A                  1               2              Leupin
2           Ba380|>16 A                 3               4              Interferon
3           29145493 EXT                5               6              Tyrosine Kinase Receptor
4           GM 95074063 A               7               8              Chloride conductance
5a          GM 83554525 A (CG5469^01)   9               10             5-hydroxytryptamine (serotonin) receptor
5b          (CG54692-01)                11              12             Serotonin Receptor
6a          21639300 EXT                13              14             Salivary Gland Protein
6b          CG51622-02                  15              16             (Von Ebner) Salivary Gland Protein
7           GM 51624520 A               17              18             CD-81
8a          27479850 EXT1               19              20             SHD
8b          CG51761-02                  21              22             SHD
9           AI284055 EXT                23              24             Hepatoma-Derived Growth Factor
10          95073892 EXT-REVCOMP        25              26             Salt-Inducible Protein Kinase



NOVX nucleic acids and their encoded polypeptides are useful in a variety of
applications and contexts. The various NOVX nucleic acids and polypeptides according to the
invention are useful as novel members of the protein families according to the presence of
domains and sequence relatedness to previously described proteins. Additionally, NOVX
nucleic acids and polypeptides can also be used to identify proteins that are members of the
family to which the NOVX polypeptides belong.

For example, NOV1 is homologous to members of SCCA family of proteins that are
important protease inhibitors and cancer antigens. Thus, the NOV1 nucleic acids,
polypeptides, antibodies and related compounds according to the invention will be useful in
therapeutic and diagnostic applications in disorders characterized by protease inhibition and
carcinoma, e.g., squamous cell carcinoma of, for example, cervix, head and neck, lung, and
esophagus.

Also, NOV2 is homologous to the interferon family of proteins. Thus NOV2 nucleic
acids, polypeptides, antibodies and related compounds according to the invention will be
useful in therapeutic and diagnostic applications in disorders characterized by, e.g.,
hyperproliferation, e.g., cancer, neurologic disease, immune disorders, and viral infection.

Further, NOV3 is homologous to a family of tyrosine kinase-like receptor proteins
important in cell proliferation and differentiation. Thus, the NOV3 nucleic acids and
polypeptides, antibodies and related compounds according to the invention will be useful in
therapeutic and diagnostic applications in developmental and proliferative disorders, e.g.,
angiogenesis, cell signaling disorders, cancer, fertility disorders, reproductive disorders,
tissue/cell growth regulation disorders.

Also, NOV4 is homologous to the chloride channel family of proteins important in
chloride ion transport. Thus, NOV4 nucleic acids, polypeptides, antibodies and related
compounds according to the invention will be useful in therapeutic and diagnostic applications
in various disorders, including, for example, cystic fibrosis, congenital myotonia, Dent
disease, an X-linked renal tubular disorder, leukoencephalopathy, malignant hyperthermia, and
hypertension.

Additionally, NOV5a and NOV5b are homologous to the serotonin receptor family of
proteins. Thus NOV5 nucleic acids, polypeptides, antibodies and related compounds
according to the invention will be useful in treating a variety of conditions, including, e.g.,
seizures, Alzheimer's disease, sleep disorders, appetite disorders, thermoregulation, pain
perception, hormone secretion and sexual behavior, mental depression, migraine, epilepsy,
obsessive-compulsive behavior (schizophrenia), drug addiction, and affective disorders.

Also, NOV6 is homologous to the salivary gland-like, or lipocalin family of proteins.
Thus NOV6 nucleic acids, polypeptides, antibodies and related compounds according to the
invention will be useful in therapeutic and diagnostic applications in various disorders,
including, for example, olfactory disorders, salivatory disorders, digestive disorders, oral
immunologic disorders, poor oral health, inflammatory processes in the airways due to
allergy/asthma, emphysema or viral infection, cystic fibrosis, and obesity.

Further, NOV7 is homologous to members of the tetraspanin family of proteins.
Thus, the NOV7 nucleic acids, polypeptides, antibodies and related compounds according to
the invention will be useful in therapeutic and diagnostic applications in disorders
characterized by inflammation, e.g., asthma, arthritis, psoriasis, and inflammatory bowel
disease.

Still further, NOV8 is homologous to a family of src homology domain-containing
proteins that are important in a variety of functions, including signal transduction. Thus,
NOV8 nucleic acids and polypeptides, antibodies and related compounds according to the
invention will be useful in therapeutic and diagnostic applications in disorders characterized
by altered signal transduction, e.g., cancer, lymphoproliferative syndrome, cerebral palsy,
epilepsy, and/or other pathologies and disorders.

NOV9 is homologous to the hepatoma-derived growth factor (HDGF) family of
proteins. Thus, NOV9 nucleic acids and polypeptides, antibodies and related compounds
according to the invention will be useful in therapeutic and diagnostic applications in various
disorders including, for example, cell proliferation disorders, development disorders, and
nephrogenesis.

Finally, NOV10 is homologous to the salt-inducible kinase family of proteins that are
important in adrenocortical functions. Thus, NOV10 nucleic acids and polypeptides,
antibodies and related compounds according to the invention will be useful in therapeutic and
diagnostic applications in various disorders, e.g., adrenoleukodystrophy, kidney disease,
atherosclerosis, and inflammation.

The NOVX nucleic acids and polypeptides can also be used to screen for molecules
which inhibit or enhance NOVX activity or function. Specifically, the nucleic acids and
polypeptides according to the invention may be used as targets for the identification of small
molecules that modulate or inhibit, e.g., neurogenesis, cell differentiation, cell proliferation,
hematopoiesis, wound healing and angiogenesis.

Additional utilities for the NOVX nucleic acids and polypeptides according to the
invention are disclosed herein.

WO 01/74851 PCT/US01/10039

NOV1

A NOV1 sequence according to the invention includes a nucleic acid sequence
encoding a polypeptide related to the leupin family of proteins. A NOV1 nucleic acid is found
on human chromosome 18. A NOV1 nucleic acid and its encoded polypeptide includes the
sequence shown in Tables 1A-1B. A disclosed NOV1 nucleic acid of 1200 nucleotides is
shown in Table 1A, and is identified as SEQ ID NO:1. The disclosed NOV1 open reading
frame ("ORF") begins at the ATG initiation codon at nucleotides 7-9, shown in bold in Table
1A. The encoded polypeptide is alternatively referred to herein as NOV1 or as AP001404_A.
The disclosed NOV1 ORF terminates at a TAA codon at nucleotides 1192-1194. As shown in
Table 1A, putative untranslated regions 5' to the start codon and 3' to the stop codon are
underlined, and the start and stop codons are in bold letters.
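
The coordinates given above can be cross-checked with a little arithmetic. The following is a minimal sketch, assuming 1-based inclusive nucleotide positions as used in the text; the function name `orf_summary` is hypothetical and not part of the disclosure. It derives the lengths of the two untranslated regions and the number of encoded residues from the sequence length, the first nucleotide of the start codon, and the last nucleotide of the stop codon.

```python
def orf_summary(seq_len, start_codon_first_nt, stop_codon_last_nt):
    """Return (5' UTR length, encoded residues, 3' UTR length).

    Positions are 1-based and inclusive. The stop codon is counted as part
    of the ORF but does not encode a residue.
    """
    coding_nt = stop_codon_last_nt - start_codon_first_nt + 1
    if coding_nt <= 0 or coding_nt % 3 != 0:
        raise ValueError("ORF length must be a positive multiple of three")
    codons_including_stop = coding_nt // 3
    utr5 = start_codon_first_nt - 1          # nucleotides before the ATG
    utr3 = seq_len - stop_codon_last_nt      # nucleotides after the stop codon
    return utr5, codons_including_stop - 1, utr3


# NOV1: 1200 nt, ATG at 7-9, TAA at 1192-1194
print(orf_summary(1200, 7, 1194))  # (6, 395, 6)
```

For the stated NOV1 coordinates this gives a 6-nucleotide 5' untranslated region, 395 encoded residues, and a 6-nucleotide 3' untranslated region.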
Table 1A. NOV1 nucleotide sequence (SEQ ID NO:1).



ATGATCQTCATAAAAACATATTTTTCTCTCCCCMAQCCTCTCy^GCTCKCCTTGG 
TGCTAGAAOTGACAGTCKACATCAGATTGATGAGGTAGTACACTT 

AAAC3AACCTGCTGGGTCCTTAAAr&ATYi&i^&rir*nnanrrrtr^n*/-t7*rt^ 




^- . * * .^mutn,^ X vin 1 A A\SAUTAT'i\iUCAACAGGCTTTATGGAGAG 

AATCTQTGAGGAATACOTAQATGGTGTGATTCAATTTTACCACACGA 

AAAAACCCnXSAAAAATCO^ACAAGAGATTAACTTCTGGGTTGAATGTCAATCCCA^ 

ACCTCTTCAGCTUySGACGCTATTAATGCTQAGACTGTaCTO^ 

CAAATQGOAAACATACTOTGACCATGAAAACACGGTGgaTGaVCCT^ 

AAGAGTOTGAAaATGATGACGCAAAAAGGCCTCTACAGAATTCGCTTCATAGAGGAG^ 

TCCTGGAAATGAGGTACSVCCAAGGGGAAGCTCAGCATaTTCGTGCTGCTGCCATCTCACT^ 

CCTGAAGGGTCrGGAAGAGCTTQAAAGGAAAATCACCTATGAAAAAATGQT^ 

AACATGTCAGAAGAATCGGTGGTCCTGrrCCrrCCCCCGGTTCACCCTGGAAGACAGCTAl^ 

COVTTTTACAAGACATGGGCATTACGGATATCTTTGATGAAACGAGGGCTGATCTTACTGG^ 

AAGTCCC^TTTGTACTTaTCAAAAATTATCCACAAAACCTTTGT^ 

GCAGCTGCAGCCT^CTOGQQCTGTTGTCTCQGAAAGGTCACTACGATCTTGGGTGQA^ 

ACCCTTTTCTCrTTTTCATTAQACACAACAAAACCCAAACCATTCTOT 
TTA AAAQGG(g 



A disclosed encoded NOV1 protein has 395 amino acid residues, referred to as the
NOV1 protein. The NOV1 protein was analyzed for signal peptide prediction and cellular
localization. SignalP results predict that NOV1 is likely to be localized in the microbody
(peroxisome), with a certainty of 0.5007. The disclosed NOV1 polypeptide sequence is
presented in Table 1B using the one-letter amino acid code.



Table 1B. Encoded NOV1 protein sequence (SEQ ID NO:2).



WSLVTj^KFCFDLFX3BIGKDDRHKKriFFSPLSLSAALGiv ^^ 
PAGSLNNSSGLVSCYF^Ll^fCLDRllCrDYTLSlA^ 
PEKSRQElWETgVECX3SQGKIKDLFSKDAlNA^TVLVLVWAVYFI^^ 
VKIMQKGLYRIQFIEWKAQILEMRYTKGKLSMWLLPSHSKDmKGLEELERKITY^ 

SEEgWLSFPRFTIiEDSYDX*NSILQr»«GlTDlFDETRADLTGlSPSPNLYLSKIIHKTFVEVDENQTQ^ 
AATGftVVSERSLRSWVEPNA^FLFFIRIflJKJQTILFYQRVGSP 



NOV1a was initially identified on chromosome 18 with a TblastN analysis of a
proprietary sequence file for leupin or a homolog, which was run against the Genomic Daily
Files made available by GenBank or from files downloaded from the individual sequencing
centers. The nucleic acid sequence was predicted from the genomic file GenBank AP001404
by homology to a known Leupin or homolog. Exons were predicted by homology and the
intron/exon boundaries were determined using standard genetic rules. Exons were further
selected and refined by means of similarity determination using multiple BLAST (for
example, tBlastN, BlastX, BlastN) searches, and, in some instances, GenScan and Grail.
Expressed sequences from both public and proprietary databases were also added when
available to further define and complete the gene sequence. The DNA sequence was then
manually corrected for apparent inconsistencies thereby obtaining the sequences encoding the
full-length protein.

A region of the NOV1 nucleic acid sequence has 515 of 789 bases (65%) identical to a
1284 nucleotide sequence coding for Homo sapiens squamous cell carcinoma antigen 2 mRNA
(SCCA2), with an E-value of 1.2e-[…] (GENBANK-ID: HSU19557|acc:U19557). Also, in a
search of public sequence databases, it was found, for example, that the NOV1 nucleic acid
sequence disclosed in this invention has 435 of 447 bases (97%, E = 8.6e-[…]) identical to an
IMAGE clone (Soares_NhHMPu_S1 Homo sapiens cDNA clone IMAGE:668321 5' similar to
SW:SCC2_HUMAN P48594 squamous cell carcinoma antigen 2) (GENBANK-ID:
AA242969). The strong (97%) homology of a 435 base pair segment of the current invention
with a 447 base pair region of this 555 bp RNA GenBank sequence suggests that the current
invention represents an expressed gene sequence. Public nucleotide databases include all
GenBank databases and the GeneSeq patent database.
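
The identity percentages quoted above follow directly from the match counts. The snippet below is an illustrative sketch only; the helper name `percent_identity` is hypothetical, and it assumes the percentage is truncated to a whole number rather than rounded, an assumption that happens to reproduce both figures in the preceding paragraph.

```python
def percent_identity(matches, aligned_length):
    """Whole-number percent identity, truncated (not rounded)."""
    if aligned_length <= 0:
        raise ValueError("aligned_length must be positive")
    if not 0 <= matches <= aligned_length:
        raise ValueError("matches must lie between 0 and aligned_length")
    return (100 * matches) // aligned_length


print(percent_identity(515, 789))  # 65
print(percent_identity(435, 447))  # 97
```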

In all BLAST alignments herein, the "E-value" or "Expect" value is a numeric
indication of the probability that the aligned sequences could have achieved their similarity to
the BLAST query sequence by chance alone, within the database that was searched. For
example, the probability that the subject ("Sbjct") retrieved from the NOV1 BLAST analysis,
e.g., Homo sapiens squamous cell carcinoma antigen 2 mRNA, matched the Query NOV1
sequence purely by chance is 1.2x10^-[…]. The Expect value (E) is a parameter that describes the
number of hits one can "expect" to see just by chance when searching a database of a
particular size. It decreases exponentially with the Score (S) that is assigned to a match
between two sequences. Essentially, the E value describes the random background noise that
exists for matches between sequences.
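
The exponential dependence of E on the score can be illustrated numerically. The sketch below is not taken from the specification: it assumes the commonly cited relation E = m * n * 2^(-S') for a normalized (bit) score S', with m the query length and n the database length, and the relation P = 1 - exp(-E) between the Expect value and the probability of at least one chance hit. The function names and the example lengths are hypothetical.

```python
import math


def expect_value(query_len, db_len, bit_score):
    """Expected number of chance hits: E = m * n * 2**(-S')."""
    return query_len * db_len * 2.0 ** (-bit_score)


def p_value(expect):
    """Probability of at least one chance hit: P = 1 - exp(-E)."""
    return -math.expm1(-expect)


# Each additional bit of score halves E for a fixed search space.
for score in (10, 11, 12):
    print(score, expect_value(100, 1000, score))

# For small E, P and E are nearly identical.
print(p_value(1e-6))
```

Under these assumptions a larger database (larger n) raises E for the same score, which is why an E value is always relative to the database that was searched.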

The Expect value is used as a convenient way to create a significance threshold for
reporting results. The default value used for blasting is typically set to 0.0001. In BLAST 2.0,
the Expect value is also used instead of the P value (probability) to report the significance of
matches. For example, an E value of one assigned to a hit can be interpreted as meaning that
in a database of the current size one might expect to see one match with a similar score simply
by chance. An E value of zero means that one would not expect to see any matches with a
similar score simply by chance. See, e.g.,
http://www.ncbi.nlm.nih.gov/Education/BLASTinfo/. Occasionally, a string of X's or N's
will result from a BLAST search. This is a result of automatic filtering of the query for
low-complexity sequence that is performed to prevent artifactual hits. The filter substitutes any
low-complexity sequence that it finds with the letter "N" in nucleotide sequence (e.g.,
"NNNNNNNNNNNNN") or the letter "X" in protein sequences (e.g., "XXXXXXXXX").
Low-complexity regions can result in high scores that reflect compositional bias rather than
significant position-by-position alignment. Wootton and Federhen, Methods Enzymol
266:554-571, 1996.
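
The effect of such filtering can be shown with a deliberately simplified stand-in. The sketch below does not implement the filters BLAST actually uses (which score local compositional complexity); it merely masks single-residue runs of a chosen minimum length, which is enough to show how a query acquires strings of N's or X's. The function name `mask_runs` and the default run length of 6 are assumptions made for illustration.

```python
import re


def mask_runs(seq, mask_char, min_run=6):
    """Replace every run of one repeated character, at least min_run long,
    with the same number of mask characters ('N' for DNA, 'X' for protein).

    Simplified stand-in for a low-complexity filter, not the real algorithm.
    """
    if min_run < 1:
        raise ValueError("min_run must be at least 1")
    pattern = re.compile(r"(.)\1{%d,}" % (min_run - 1))
    return pattern.sub(lambda m: mask_char * len(m.group(0)), seq)


print(mask_runs("ACGTAAAAAAAAGT", "N"))  # ACGTNNNNNNNNGT
print(mask_runs("MKTQQQQQQQLV", "X"))    # MKTXXXXXXXLV
```

Because the masked positions can no longer contribute to a score, a repeat-rich stretch cannot by itself produce a high-scoring hit.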

A BLASTX search was performed against public protein databases. The disclosed
NOV1 protein (SEQ ID NO:2) has good identity with a number of leupin-like proteins. For
example, the full amino acid sequence of the protein of the invention was found to have 196 of
395 amino acid residues (49%) identical to, and 270 of 395 residues (68%) positive with, the
390 amino acid squamous cell carcinoma antigen 2 (SCCA-2, leupin) protein from Homo
sapiens (ptnr:SWISSPROT-ACC:P48594, E = 4.8e-93). Public amino acid databases include
the GenBank databases, SwissProt, PDB and PIR.

Other BLAST results include sequences from the Patp database, which is a proprietary
database that contains sequences published in patents and patent publications. Patp results
include those listed in Table 1C.



Table 1C. Patp alignments of NOV1

                                                                                   Smallest
                                                                                   Sum
                                                               Reading   High      Prob.
Sequences producing High-scoring Segment Pairs:                Frame     Score     P(N)
Patp:Y25927 Human SCCA2 protein - Homo sapiens, 390 aa.        +1        932       8.0e-93
Patp:W15242 Psoriastatin type II - Homo sapiens, 390 aa.       +1        928       2.1e-92
Patp:B25276 SCC antigen - Synthetic, 390 aa.                   +1        910       1.7e-90
Patp:Y32077 Hepatitis B virus receptor SCCR1 - Homo sapiens    +1        910       1.7e-90

For example, a BLAST against patp:Y25927, a 390 amino acid SCCA2 from Homo
sapiens, produced good identity (E = 8.0e-93).



The disclosed protein is also similar to the leupin-like proteins in Table 1D.



Table 1D. BLAST results for NOV1

Gene Index/Identifier            Protein/Organism                   Length  Identity  Positives  Expect
                                                                    (aa)    (%)       (%)
gi|1710877|sp|P48594|SCC2_HUMAN  SQUAMOUS CELL CARCINOMA            390     181/396   252/396    2e-85
(X89015), (U19557), (U19576),    ANTIGEN 2 (SCCA-2) (LEUPIN)                (45%)     (62%)
(AB035089)                       Homo sapiens

gi|2118384|pir||I38202           leupin precursor                   390     181/396   252/396    3e-85
                                 Homo sapiens                               (45%)     (62%)

gi|2118383|pir||I38201           Squamous cell carcinoma            390     179/396   252/396    4e-83
                                 antigen 1                                  (45%)     (63%)
                                 Homo sapiens

gi|1172087|gb|AAA86317.1|        Squamous cell carcinoma            390     179/396   252/396    4e-83
(U19568); (U19556)               antigen 1 Homo sapiens; serine             (45%)     (63%)
                                 (or cysteine) proteinase
                                 inhibitor, clade B (ovalbumin),
                                 member 3



A ClustalW analysis comparing disclosed proteins of the invention with related leupin
protein sequences is given in Table 1E, with NOV1 shown on line 1.

In the ClustalW alignment of the NOV1 protein, as well as all other ClustalW analyses
herein, the black outlined amino acid residues indicate regions of conserved sequence
(i.e., regions that may be required to preserve structural or functional properties), whereas
non-highlighted amino acid residues are less conserved and can potentially be mutated to a much
broader extent without altering protein structure or function.

The NOV1 protein has significant homology to leupin-like proteins.

Table 1E. ClustalW Analysis of NOV1

1) Novel NOV1 (SEQ ID NO:2)
2) gi|1710877|sp|P48594|SCC2 SQUAMOUS CELL CARCINOMA ANTIGEN 2 (SCCA-2) (LEUPIN) (SEQ ID NO:27)
3) gi|2118384|pir||I38202 leupin precursor (SEQ ID NO:28)
4) gi|2118383|pir||I38201 squamous cell carcinoma antigen 1 (SEQ ID NO:29)
5) gi|5902072|ref|NP_008850.1| serine (or cysteine) proteinase inhibitor, clade B (ovalbumin), member 3; SCCA-1 (SEQ ID NO:30)



[ClustalW alignment of NOV1 PRT with gi|1710877|, gi|2118384|, gi|2118383| and gi|5902072|; alignment rows not recoverable]

The presence of identifiable domains in NOV1, as well as all other NOVX proteins,
was determined by searches using software algorithms such as PROSITE, DOMAIN, Blocks,
Pfam, ProDomain, and Prints, and then determining the Interpro number by crossing the
domain match (or numbers) using the Interpro website (http://www.ebi.ac.uk/interpro).
DOMAIN results, e.g., for NOV1 as disclosed in Table 1F, were collected from the Conserved
Domain Database (CDD) with Reverse Position Specific BLAST analyses. This BLAST
analysis software samples domains found in the Smart and Pfam collections. For Table 1E
and all successive DOMAIN sequence alignments, fully conserved single residues are
indicated by black shading and "strong" semi-conserved residues are indicated by grey
shading. The "strong" group of conserved amino acid residues may be any one of the
following groups of amino acids: STA, NEQK, NHQK, NDEQ, QHRK, MILV, MILF, HY,
FYW.
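
The shading rule described above can be illustrated with a minimal Python sketch. It is not the software used to prepare the tables; the function names and the toy alignment are hypothetical, and the strong groups follow the ClustalW convention. A column is called "identical" when every sequence carries the same residue, "strong" when all residues fall inside a single strong group, and "none" otherwise (including any column with a gap).

```python
# Strong conservation groups in the ClustalW convention (assumed here).
STRONG_GROUPS = ("STA", "NEQK", "NHQK", "NDEQ", "QHRK", "MILV", "MILF", "HY", "FYW")


def classify_column(residues):
    """Classify one alignment column as 'identical', 'strong', or 'none'."""
    column = {r.upper() for r in residues}
    if "-" in column:
        # A gap in any sequence means the column is not treated as conserved.
        return "none"
    if len(column) == 1:
        return "identical"
    if any(column <= set(group) for group in STRONG_GROUPS):
        return "strong"
    return "none"


def classify_alignment(rows):
    """Classify every column of an alignment given as equal-length strings."""
    if len({len(row) for row in rows}) != 1:
        raise ValueError("all aligned rows must have the same length")
    return [classify_column(column) for column in zip(*rows)]


if __name__ == "__main__":
    # Hypothetical three-sequence alignment, for illustration only.
    toy_rows = ["MKSA", "MRTA", "MKAG"]
    print(classify_alignment(toy_rows))
```

For the toy rows, the first column is identical (M), the second and third fall inside the QHRK and STA groups respectively, and the fourth (A, A, G) matches no group.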

Table 1F lists the domain description from DOMAIN analysis results against NOV1.
The region from amino acid residue 13 through 395 (SEQ ID NO:2) most probably (E = 3e-95)
contains a "SERPIN" (Serine proteinase inhibitor) domain, aligned here with the 360 amino
acid SERPIN (Smart database), Pfam 00079 (SEQ ID NO:31). This indicates that the NOV1
sequence has properties similar to those of other proteins known to contain this domain.

Table 1F. Domain Analysis of NOV1

gnl|Smart|SERPIN, SERine Proteinase INhibitors

CD-Length = 360 residues, 100.0% aligned

Score = 341 bits (875), Expect = 3e-95



[Alignment of NOV1 with Pfam|pfam00079; the shaded alignment graphic is not reproduced in this text.]




The representative member of the SERPIN family is shown in Table 1F. The family
contains 58 sequences, including SCCA and many serine protease inhibitors.

Barnes and Worrall described the cloning of a member of the serpin family of serine
protease inhibitors by degenerate PCR and screening of a HeLa cell cDNA library. Barnes
and Worrall, FEBS Lett. 373: 61-65, 1995. The isolated cDNA encodes a 390-amino acid
protein, designated leupin, that is 91.8% identical to SCCA1. The authors stated that the
reactive site of leupin differs from SCCA1 in the active loop region, including the presence of
a leucine residue rather than a serine at the P(1) position within the loop region that acts as a
pseudo-substrate for the target protease. Barnes and Worrall speculated that leupin may be a
cysteine protease inhibitor, and that the isoelectric point is consistent with the acidic form of
SCCA associated with squamous cell carcinomas. Barnes and Worrall, supra.

The squamous cell carcinoma antigen (SCCA) is a member of the ovalbumin family of
serine proteinase inhibitors (serpins). The protein was isolated from a metastatic cervical
squamous cell carcinoma by Kato and Torigoe, Cancer 40:1621-1628, 1977 (See, e.g., Online
Mendelian Inheritance in Man (OMIM), available at http://www.ncbi.nlm.nih.gov/, entry
600517 and 600518). SCCA is detected in the superficial and intermediate layers of normal
squamous epithelium, whereas the mRNA is detected in the basal and subbasal levels. The
clinical import of SCCA has been as a circulating tumor marker for squamous cell carcinoma,
especially those of the cervix, head and neck, lung, and esophagus. The squamous cell
carcinoma antigen (SCCA) serves as a serological marker for more advanced squamous cell
tumors. Many clinical studies of cervical squamous cell carcinoma show that the percentage
of patients with elevated circulating levels of SCCA increases from approximately 12% at
stage 0 to more than 90% at stage IV. Levels fall after tumor resection and rise in
approximately 90% of the patients with recurrent disease. Similar trends occur in the other
types of squamous cell carcinoma, with a maximum sensitivity of approximately 60% for
lung, 50% for esophageal, and 55% for head and neck tumors. The neutral form of SCCA is
detected in the cytoplasm of normal and some malignant squamous cells, whereas the acidic
form is expressed primarily in malignant cells and is the major form found in the plasma of
cancer patients. Thus, the appearance of the acidic fraction of SCCA is correlated with more
aggressive tumors.

In an analysis of chromosomal aberrations involving human chromosome band 18q21,
Silverman et al. (Silverman, et al., Genomics 9:219-228, 1991) identified a DNA fragment,
A56R (D18S86), that contained a 56/57-bp match with the published cDNA sequence of
SCCA (Suminami et al., Biochem. Biophys. Res. Commun. 181:51-58, 1991). Schneider et
al. (Proc. Nat. Acad. Sci. 92:3147-3151, 1995) showed that this fragment contained exon 3 of
a new gene, SCCA2 (OMIM-600518), which was 92% identical to SCCA1. SCCA1 and
SCCA2, which map within 18q21.3, are tandemly arrayed and flanked by two members of the
ovalbumin family of serine proteinase inhibitors, plasminogen activator inhibitor type 2
(PAI2; OMIM-173390) and maspin (protease inhibitor 5; PI5; OMIM-154790). The predicted
pI values and molecular weights of the cDNAs suggested that the neutral and acidic forms of
the SCCA were encoded by SCCA1 and SCCA2, respectively. Analysis of the primary amino
acid sequences shows that both genes are members of the high molecular weight serpin
superfamily of serine proteinase inhibitors.

Although SCCA1 and SCCA2 are nearly identical in primary structure, the reactive
site loop of each inhibitor suggests that they may differ in their specificity for target
proteinases. SCCA1 has been shown to be effective against papain-like cysteine proteinases.
Schick et al. demonstrated that SCCA2 inhibits the chymotrypsin-like proteinases cathepsin G
(OMIM-116830) and mast cell chymase (OMIM-118938) in vitro. Schick, et al., J. Biol.
Chem. 272:1849-1855, 1997. SCCA2 was ineffective against papain-like cysteine
proteinases, which have been shown to be inhibited by SCCA1 (OMIM 600518).

The nucleic acids and proteins of NOV1 are useful in potential therapeutic applications
implicated in various leupin- or serpin-related pathologies and/or disorders. For example, a
cDNA encoding the leupin-like protein may be useful in gene therapy, and the leupin-like
protein may be useful when administered to a subject in need thereof. The novel nucleic acid
encoding NOV1 protein, or fragments thereof, may further be useful in diagnostic
applications, wherein the presence or amount of the nucleic acid or the protein are to be
assessed. These materials are further useful in the generation of antibodies that bind
immunospecifically to the novel substances of the invention for use in therapeutic or
diagnostic methods. The NOVX nucleic acids and proteins are useful in potential diagnostic
and therapeutic applications implicated in various diseases and disorders described below
and/or other pathologies. The NOV1 nucleic acids and proteins are useful in therapeutic
applications implicated in, for example, connective tissue remodeling; Alzheimer's Disease;
hypertension; cardiac hypertrophy; coronary heart disease; squamous cell carcinoma,
especially those of the cervix, head and neck, lung, and esophagus; and/or other pathologies
and disorders.

For example, a cDNA encoding the leupin-like protein may be useful in gene therapy,
and the leupin-like protein may be useful when administered to a subject in need thereof. By
way of nonlimiting example, the compositions of the present invention will have efficacy for
treatment of patients suffering from connective tissue remodeling; Alzheimer's Disease;
hypertension; cardiac hypertrophy; coronary heart disease; squamous cell carcinoma
(especially those of the cervix, head and neck, lung, and esophagus). The novel nucleic acid
encoding leupin-like protein, and the leupin-like protein of the invention, or fragments thereof,
may further be useful in diagnostic applications, wherein the presence or amount of the nucleic
acid or the protein are to be assessed.

Further, the protein similarity information, expression pattern, and map location for
NOV1 suggest that NOV1 may have important structural and/or physiological functions
characteristic of the SCCA family. Therefore, the nucleic acids and proteins of the invention
are useful in potential diagnostic and therapeutic applications and as a research tool. These
include serving as a specific or selective nucleic acid or protein diagnostic and/or prognostic
marker, wherein the presence or amount of the nucleic acid or the protein are to be assessed, as
well as potential therapeutic applications such as the following: (i) a protein therapeutic, (ii) a
small molecule drug target, (iii) an antibody target (therapeutic, diagnostic, drug
targeting/cytotoxic antibody), (iv) a nucleic acid useful in gene therapy (gene delivery/gene
ablation), (v) a composition promoting tissue regeneration in vitro and in vivo, and (vi) a
biological defense weapon.

These materials are further useful in the generation of antibodies that bind
immunospecifically to the novel NOV1 substances for use in therapeutic or diagnostic methods. These
antibodies may be generated according to methods known in the art, using prediction from
hydrophobicity charts, as described in the "Anti-NOVX Antibodies" section below. The
disclosed NOV1 protein has multiple hydrophilic regions, each of which can be used as an
immunogen. In one embodiment, a contemplated NOV1 epitope is from about amino acids 10
to 30. In another embodiment, a NOV1 epitope is from about amino acids 50 to 75. In
additional embodiments, NOV1 epitopes are from amino acids 90 to 125, 130 to 160, 180 to 200,
and from amino acids 260 to 280. These novel proteins can be used in assay systems for
functional analysis of various human disorders, which will help in understanding of pathology
of the disease and development of new drug targets for various disorders.

NOV2 

A novel nucleic acid was identified on chromosome 9 by TblastN using CuraGen
Corporation's sequence file for interferon or homolog as run against the Genomic Daily Files
made available by GenBank or from files downloaded from the individual sequencing centers.
The nucleic acid sequence was predicted from the genomic file Seq Ctr
ACCNO:sggc_draft_ba380p16_20000326 by homology to a known interferon or homolog.
Exons were predicted by homology and the intron/exon boundaries were determined using
standard genetic rules. Exons were further selected and refined by means of similarity
determination using multiple BLAST (tBlastn, BlastX, Blastn) searches, and, in some
instances, Genscan and Grail. Expressed sequences from both public and proprietary
databases were also added when available to further define and complete the gene sequence.
The DNA sequence was then manually corrected for apparent inconsistencies thereby
obtaining the sequences encoding the full-length protein. The novel nucleic acid of 695
nucleotides (ba380p16_A, SEQ ID NO:3) encoding a novel interferon-like protein is shown in
Table 2A.



Table 2A. NOV2 Nucleotide Sequence (SEQ ID NO:3) 

AAAATGGTATTATTAGAACAGGATTTCCAGTTCGGACTCGGTCCCCrrCCTGGTGGCCCTGCTGCTT 

ACTGTGGCCCTGTTGGATCTCTGGGCtTTGACCTGCCTCAGAACCATGGCCTACTTAGCAG^ 

GGCTCTTCTGGGCCAAATGCAGAGAATCTCCCCTTTCTTGTGTerCAAGGAOWSAAGAGACTTCA^ 

CCCCTTTTTTTTGTTGATGGCAGCCAGTTGGATAAGGCCCyVGGCCCTGTCTGTCCTCCA^ 

AGCAGATGTTCAGCGTCTACCGCACAGAGTGCTCCTCTGCTGCCTGGAACATGACCCTCCTGGACCAGCT 

CCACACTGGATTTCATCTGTATCTAGGATGCCTGGAGTC^CGCTTAGGGCAGGCAATAGGAGAGGAAGAA 

TCTGTAGGGGTGATTGTGGCCGCTACACTGGCCTTGAGGAGGTACTTCCAGGGAATCCATGGAATCCAGA 

GAATCTACCTGAAAGAGAAGAAATACAGTGAGTGTGCTTGGGAGGTTCTCAGAGrGGGAATCATGAAATC 

C*]?TCTCTTCATCAACAAACTTGCAAGGACTGAGAAGTAAGGATGAAGACCTGGGGTCTGCTTTAGTCTTT 

CTTATTTTCTTCCTCTTCCTTACTATGTGTTTATTCCTTCTrTTTCTAGTTCCT.aaACTTGTAAA 

In a search of public sequence databases, it was found, for example, that the nucleic
acid sequence has 622 of 673 bases (92%) identical to a 3659 bp synthetic omega 4-interferon
mRNA (GENBANK-ID: A12146|acc:A12146) (E = 2.6e-122). It was also found, for
example, that the nucleic acid sequence of the invention has 233 of 244 bases (95%) identical
to Homo sapiens interferon genes LeIF-L, LeIF-J, and pseudogene LeIF-M located on
chromosome 9 (9937 bp, GENBANK-ID: HSIFD1|acc:V00531, E = 2.2e-42). The strong
(95%) homology of a 243 base pair segment of the current invention with 244 base pair region
of the above GenBank sequence suggests that the current invention represents an expressed
interferon gene and polypeptide. Public nucleotide databases include all GenBank databases
and the GeneSeq patent database.

An open reading frame was identified beginning with an ATG initiation codon at
nucleotides 4-6 and ending with a TAA codon at nucleotides 685-687. A putative untranslated
region upstream from the initiation codon and downstream from the termination codon is
underlined in Table 2A, and the start and stop codons are in bold letters. The disclosed NOV2
polypeptide (SEQ ID NO:4) encoded by SEQ ID NO:3 is 227 amino acid residues and is
presented using the one-letter code in Table 2B. The NOV2 protein was analyzed for signal
peptide prediction and cellular localization. SignalPep results predict that NOV2 is cleaved
between position 29 and 30 of SEQ ID NO:4, i.e., at the slash in the amino acid sequence
VGS-LG. Psort and Hydropathy profiles also predict that NOV2 contains a signal peptide and
is likely to be localized at the plasma membrane (certainty of 0.9190).

Table 2B. Encoded NOV2 protein sequence (SEQ ID NO:4).

LFFVDGSQiaKAQMiSVLHEMLCK^lFSVYPTECSSAAWNMTLLDQiaTGFHIiYLGCLESRI^ 

VGVIVAPTlAlJlRYFQGIiiGIQRIYLKEia^ySDCAWEVLRVGIMKSFSSSTNLQGLRS^^ 

IFFLFLTMCLrLLFLVP 
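
As a consistency check on the open reading frame described for NOV2 (initiation codon at nucleotides 4-6, termination codon at nucleotides 685-687, 227 encoded residues), the following short Python sketch computes the encoded length from ORF coordinates. The function name is hypothetical and the coordinates are treated as 1-based and inclusive, which is an assumption about the numbering convention.

```python
def orf_protein_length(start_codon_first, stop_codon_last):
    """Return the number of encoded residues for an ORF.

    start_codon_first: 1-based position of the first base of the start codon.
    stop_codon_last:   1-based position of the last base of the stop codon.
    The stop codon is counted in the span but does not encode a residue.
    """
    span = stop_codon_last - start_codon_first + 1
    if span < 6 or span % 3 != 0:
        raise ValueError("ORF span must be a multiple of 3 covering start and stop codons")
    return span // 3 - 1


if __name__ == "__main__":
    # NOV2 coordinates quoted in the text: ATG at 4-6, TAA at 685-687.
    print(orf_protein_length(4, 687))  # 227
```

The span 4 to 687 covers 684 bases, i.e. 228 codons, one of which is the stop codon, which agrees with the stated 227 amino acid residues.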



The full amino acid sequence of the protein of the invention was found to have 139 of
195 amino acid residues (71%) identical to, and 153 of 195 residues (78%) positive with, the
195 amino acid residue interferon omega-1 precursor (interferon alpha-II-1) protein from
Homo sapiens (ptnr: SWISSPROT-ACC:P05000) (E = 9.2e-65). Public amino acid databases
include the GenBank databases, SwissProt, PDB and PIR.
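
The identity and positive percentages quoted above follow directly from the match counts. The minimal Python sketch below assumes the percentage is truncated toward zero rather than rounded, which reproduces the 71% and 78% figures; the function name is hypothetical and the truncation rule is an assumption, not a documented property of the search software.

```python
def truncated_percent(matches, aligned_length):
    """Return matches/aligned_length as an integer percentage, truncated toward zero."""
    if aligned_length <= 0 or not 0 <= matches <= aligned_length:
        raise ValueError("need 0 <= matches <= aligned_length and aligned_length > 0")
    return (100 * matches) // aligned_length


if __name__ == "__main__":
    print(truncated_percent(139, 195))  # identities: 71
    print(truncated_percent(153, 195))  # positives: 78
```

Under the same assumption, 181 matches over 396 aligned positions gives 45%, whereas rounding to the nearest integer would give 46%.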

As shown in Table 2C, Patp analysis shows that NOV2 has significant homology with
a number of interferons. Interferons (IFN) produce antiviral and antiproliferative responses in
cells. Interferons are classified into five groups, all of them related but gamma-IFN.



Table 2C. Patp alignments of NOV2

Sequences producing High-scoring Segment Pairs: | Reading Frame | High Score | Smallest Sum Prob. P(N)
Patp:P60253 Interferon-omega-1 - H. sapiens, 195 aa. | +1 | 665 | 1.6e-64
Patp:Y22635 Human interferon-omega protein - H. sapiens. | +1 | 665 | 1.6e-64
Patp:Bi3433 Human interferon omega - H. sapiens, 195... | +1 | 665 | 1...
Patp:P60355 Sequence of human leucocyte interferon | +1 | 657 | 1.1e-63



For example, a BLAST against patp:Y22635, a 195 amino acid interferon omega
protein from Homo sapiens, produced 139/195 (71%) identity, and 153/195 (78%) positives (E
= 1.6e-64). See, PCT application WO 99/26663, describing human interferon-omega and
constructs and vectors containing interferon-omega. The compositions containing the
constructs are used in human or veterinary medicine for treating a wide variety of cancers,
particularly melanoma, glioma, and ovarian carcinoma (also metastases to lung and liver), or
pancreatic, gastric, colonic, and mesenteric cancers. The proteins listed in Table 2C show
long segments of amino acid identity, as shown by the vertical lines (|) in Table 2D.
Conservative substitutions are indicated by a plus sign (+).



Table 2D. Alignment of NOV2 with Y22635 Human interferon omega

(SEQ ID NO:32)

Length = 195 Plus Strand HSPs:

Score = 665 (234.1 bits), Expect = 1.6e-64, P = 1.6e-64

Identities = 139/195 (71%), Positives = 153/195 (78%), Frame = +1

NOV2: 37 LGPLLVALLLCHCGFVGSLGHT^LPQNHGLLSRNTLRLLGQMQRISP^ 216

I III illlll lllllllllillll 11 ll+llllllillllltllll

IFN: 4 liFPLLAfeLVHTSYSPVGSI^CDLPQNHGLLSRNTLVXLHQMRRISPFLClJCDRKDFR^ 63

NOV2: 217 FFVDGSQLHKAQALSVLHEMLQQIFSVYPTECSSAAWNMTLLDQLHTGFHLYLGCLESRL 396

I itii i! +111111111111++ II iiiiiiniiiiiiii I I 11+ I

IFN: 64 EMVKGSQLQKAHVMSVLHEMLQQIPSLE^TERSSAAWNMTLLDQLHT^^ 123

NOV2: 397 GQAIGEEESVGVIVAPTIAIRRYFQGIHGIQRIYLKEKKYSDCAWEVLRVGIMKSFSSST 576

' ^" N M +1 ( IKKlli l + ||l!li!i!fli|| + t+ lill II

IFN: 124 LQWGEGESAGAISSPALTLRRYFQGI^ RVYLKEKKYSDCAWEWRMEIMKSLHliST 179

NOV2: 577 NLQG-LRSKDEDLGSA 621

1+1 mil I1II+

IFN: 180 NMQERLRSKDRDLGSS 195



Other BLAST results, including the sequences used for ClustalW analysis, are presented
in Table 2E.

Table 2E. BLAST results for NOV2

Gene Index/Identifier | Protein/Organism | Length (aa) | Identity (%) | Positives (%) | Expect
gi|4504605|ref|NP_002168.1| | Interferon (IFN), omega 1 Homo sapiens (INTERFERON ALPHA-II-1) | 195 | 133/182 (73%) | 145/182 (79%) | 1e-63
gi|386800|gb|AAA52724.1| (M11003) | IFN-alpha Homo sapiens | 195 | 132/182 (72%) | 144/182 (78%) | 8e-63
gi|758083|emb|CAA26501.1| (X02669) | IFN omega precursor Homo sapiens | 174 | 129/178 (72%) | 141/178 (78%) | 7e-61
gi|847816|gb|AAA70031.1| (U25670) | IFN omega-1 Homo sapiens | 174 | 126/175 (72%) | 137/175 (78%) | 3e-59
gi|124502|sp|P05002|INO2_HORSE | IFN-omega-2 precursor Equus caballus | 195 | 117/181 (64%) | 136/181 (74%) | 2e-53



This information is presented graphically in the multiple sequence alignment given in
Table 2F (with NOV2 being shown on line 1) as a ClustalW analysis comparing NOV2 with
related protein sequences.

Table 2F. Information for the ClustalW proteins:

1) NOV2 (SEQ ID NO:4)

2) gi|4504605|ref|NP_002168.1| interferon, omega 1 (SEQ ID NO:33)

3) gi|386800|gb|AAA52724.1| (M11003) interferon-alpha (SEQ ID NO:34)

4) gi|758083|emb|CAA26501.1| (X02669) human interferon ome (SEQ ID NO:35)

5) gi|847816|gb|AAA70091.1| (U25670) interferon ome

6) gi|124502|sp|P05002|INO2_HORSE INTERFERON OMEGA-2 PRECURSOR (INTC (SEQ ID NO:37)



[ClustalW multiple sequence alignment of NOV2 (positions 1 to 220) with the five interferon sequences listed above; the alignment graphic is not reproduced in this text.]

DOMAIN results for NOV2 were collected from the Conserved Domain Database
(CDD) with Reverse Position Specific BLAST. This BLAST samples domains found in the
Smart and Pfam collections. NOV2 showed significant alignment with Pfam 00143
(interferon, Interferon alpha/beta domain, E = 6e-57) and Smart IFabd (Interferon alpha, beta
and delta, 117 amino acid residues, E = 8e-31). The alignment with Pfam00143 is shown in
Table 2G. The similarity of NOV2 with the Interferon alpha/beta domain indicates that the
NOV2 sequence has properties similar to those of other proteins known to contain this domain
as well as to the interferon domain itself.



Table 2G. Domain Analysis of NOV2

gnl|Pfam|pfam00143, interferon, Interferon alpha/beta domain
CD-Length = 190 residues, 91.6% aligned

Score = 213 bits (543), Expect = 6e-57



[Alignment of NOV2 with the conserved interferon domain (rows labeled NOV2 and gnl|Smart|IFabd); the shaded alignment graphic is not reproduced in this text.]



Type I interferons (for example, IFN-alpha, IFN-beta, and IFN-omega) bind to the type
I interferon (IFN) receptor and elicit signaling events including activation of the Jak/Stat and
IRS pathways (OMIM: 602376). Henco et al. (J Mol Biol. 185:227-260, 1985) compiled
partial maps of the interferon gene cluster located on 9p21. These maps showed that members
of the two main families of genes in the IFN superfamily, interferon-alpha (OMIM-147660)
and IFN-omega, are interspersed. Olopade et al. (Genomics 14:437-443, 1992) studied the
deletions of the short arm of chromosome 9 frequently observed in acute lymphoblastic
leukemia and in gliomas. These deletions often include the entire interferon gene cluster,
which comprises about 26 IFN-alpha, IFN-omega, and IFN-beta 1 (OMIM-147640) genes, as
well as the gene for methylthioadenosine phosphorylase (MTAP; OMIM-156540). By
comparing microscopic deletions with the genes lost at the molecular level, Olopade et al.
determined the order of these genes on 9p to be:

tel -- IFN-beta 1 -- IFN-alpha/IFN-omega cluster -- MTAP -- cen.

In a few cell lines and in primary leukemia cells, they observed deletions that had
breakpoints within the interferon gene cluster and resulted in partial loss of the interferon
genes. These partial deletions allowed them to determine the order of some genes or groups of
genes in the IFN-alpha/IFN-omega gene cluster. From their deletion analysis, Olopade et al.
deduced the following order of the interferon genes on 9p:

pter -- IFN-beta 1 -- (IFN-omega 1, IFN-alpha 21) -- IFN-omega P15 -- IFN-alpha 4 --
IFN-omega 9 -- IFN-alpha 7 -- IFN-alpha 10 -- IFN-omega P18 -- IFN-alpha P16 -- IFN-alpha
17 -- IFN-alpha 14 -- (IFN-alpha 22, IFN-alpha 5, IFN-alpha P20, IFN-alpha 6, IFN-alpha 13,
IFN-alpha 2) -- (IFN-alpha 8, IFN-omega 2, IFN-omega P19, IFN-alpha 1) -- MTAP -- cen.

The genes within the large linkage group are arranged in tandem with their 3-prime
end pointing toward the telomere of the short arm. Thus, at least two functional interferon-omega
genes, IFN-omega 1 and IFN-omega 2, were mapped and several interferon-omega
pseudogenes (e.g., IFN-omega P15) were localized.

Apart from their antiviral activities, interferons also possess antiproliferative and
immunomodulating activities and influence the metabolism, growth and differentiation of cells
in many different ways.

Omega-Interferon (IFN-omega) is a natural component of human leukocyte interferon
(LeIFN). This interferon is also called IFN-alpha II1. It displays a high degree of homology
with various IFN-alpha species including positions of the cysteine residues involved in
disulfide bonds. However, sequence divergence allows classification as a unique protein
family. IFN-omega binds to the same receptors as IFN-alpha and IFN-beta. To date the exact
biological activities and the physiological role of this interferon are unknown. It is thought to
influence cell proliferation and differentiation. One related protein is bovine trophoblast
protein-1 (TP-1), which is produced in large quantities during pregnancy, and is a potent
antiviral, antiproliferative and immunosuppressive agent. See, generally,
http://www.copewithcytokines.de.

Mire-Sluis et al. describe bioassays for IFN-alpha, IFN-beta and IFN-omega that
exploit the ability of these factors to inhibit proliferation of TF-1 cells (a human premyeloid
cell line) induced by GM-CSF. Mire-Sluis, et al., J. of Immunol. Meth. 195:55-61, 1996. The
bioassays can be used also with Epo and TF-1 cells, or Epo and Epo-transfected UT-7 cells.

The nucleic acids and proteins of the invention are useful in potential therapeutic
applications implicated in various interferon-related pathological disorders, described further
below. For example, a cDNA encoding the interferon-like protein may be useful in gene
therapy, and the interferon-like protein may be useful when administered to a subject in need
thereof. By way of nonlimiting example, the compositions of the present invention will have
efficacy for treatment of patients suffering from hyperproliferative disorders, viral or other
pathogenic infection, immune disorders, and disorders of the neuroendocrine system.

For example, the nucleic acids and proteins of the invention are useful in potential
therapeutic applications implicated in viral infections; neurologic disease; cancer (especially
acute lymphoblastic leukemia and gliomas, malignant melanoma, non-Hodgkin's
lymphoma, squamous cell carcinoma); immune disorders; and/or other pathologies and
disorders including their immunotherapy. Thus, a cDNA encoding the interferon-like protein
may be useful in gene therapy, and the interferon-like protein may be useful when
administered to a subject in need thereof. By way of nonlimiting example, the compositions
of the present invention will have efficacy for treatment of patients suffering from viral
infections; cancer, especially acute lymphoblastic leukemia and gliomas; neurologic disease;
and/or immune disorders.

The novel nucleic acid encoding the interferon-like protein of the invention, or
fragments thereof, may further be useful in diagnostic applications, wherein the presence or
amount of the nucleic acid or the protein are to be assessed. These materials are further useful
in the generation of antibodies that bind immunospecifically to the novel substances of the
invention for use in therapeutic or diagnostic methods. These antibodies may be generated
according to methods known in the art, using prediction from hydrophobicity charts, as
described in the "Anti-NOVX Antibodies" section below. The disclosed NOV2 protein has
multiple hydrophilic regions, each of which can be used as an immunogen. In one
embodiment, a contemplated NOV2 epitope is from about amino acids 40 to 50. In another
embodiment, a NOV2 epitope is from about amino acids 55 to 65. In additional embodiments,
NOV2 epitopes are from amino acids 75 to 85, and from amino acids 150 to 200. These novel
proteins can also be used to develop assay systems for functional analysis.

These novel proteins can be used in assay systems for functional analysis of various 
human disorders, which will help in understanding of pathology of the disease and 
development of new drug targets for various disorders. 

NOV3 

NOV3 is a novel Receptor Tyrosine Kinase-like protein and nucleic acid encoding it. This
sequence was initially identified by searching CuraGen's Human SeqCalling database for
DNA sequences that translate into proteins with similarity to a protein family of interest.
SeqCalling assembly 29145493 was identified as having suitable similarity. SeqCalling
assembly 29145493 was analyzed further to identify an open reading frame encoding for a
novel full length protein and novel splice forms of this gene. This was done by extending the
SeqCalling assembly using suitable additional SeqCalling assemblies, publicly available EST
sequences and public genomic sequence. Public ESTs and additional CuraGen SeqCalling
assemblies were identified by the Curatools program SeqExtend. They were included in the
DNA sequence extension for SeqCalling assembly 29145493 only when sufficient identical
overlap was found. These inclusions are described below. The genomic clone AC023225
(chromosome 1) was identified as having regions with 100% identity to the SeqCalling
assembly 29145493 and was selected for analysis because this identity implied that the clone
AC023225 contained the sequence of the genomic locus for SeqCalling assembly 29145493.
The genomic clone AC023225 was analyzed by Genscan and Grail to identify exons and
putative coding sequences/open reading frames. The clone AC023225 was also analyzed by
TblastN, BlastX and other homology programs to identify regions translating to proteins with
similarity to the original protein/protein family of interest.

The results of these analyses were integrated and manually corrected for apparent
inconsistencies, thereby obtaining the sequence encoding the full-length protein. When
necessary, the process to identify and analyze cDNAs/ESTs and genomic clones was reiterated
to derive the full-length sequence. NOV3 describes this full-length DNA sequence(s) and the
full-length protein sequence(s) which they encode.

The novel nucleic acid of 3003 nucleotides (29145493_EXT, SEQ ID NO:5) encoding
a novel tyrosine kinase-like protein is shown in Table 3A. An open reading frame (ORF) was
identified beginning with an ATG initiation codon at nucleotides 1-3 and ending with a TGA
codon at nucleotides 3001-3003. In Table 3A, the start and stop codons are in bold letters.

Table 3A. NOV3 Nucleotide Sequence (SEQ ID NO:5)

ATGGTATTGACAACTGCTATACCAGCCTGGCTTCTTAGCTGTTCCCTCCCACTCTCATCCTGGGCCC^ 
ATGCGACACCOCCCCTCCGTCTAGTAGTTATCCTCCTGGATTCCAAAQCCTCCCAGGCCGAGCTGGGCTG 
GACTGCACTGCCAAGTAATGGGTGGGAGGAGATCAGCGGCGTGGATGAACACGACCGTCCCATGCG^^ 
TACCAAGTGTGCAATGTGCTGGAGCCCAACCAGGACAACTGGCTGl^GACTGGCTGGATAAGCCGTGGCC 
GCGGGCmGCGCATCTTCGTGGAACTGCAGTTCACACTCCGTGACTGCAGCAGCATCCCTGGCGCCG 
-TACCTGCAAGGAGACCTTCAACGTCTACTACCTGGAAACTGAGGCCGACCTGGGCCGTGGGCGTCCCCGC 
CTAGGCGGCAGCCGGCCCCGCAAAATCGACACGATCGCGGCGGACGAGAGCTTGACGCAGGGCGACCTGG 
GTGAGCGCAAGATGAAGCTGAACACAGAG6TGCGCGAGATCGGACCGCTCAGCCGGCGGGGTTTCCACCT 
GGCCTTTCAGGACGTGGGCGCATGCGTGQCGCTTGTCTCGGTGCGCGTCTACTACAAGCAGTGCCGCGCG 
ACCGTGCGGGGCCTGGCCACGTTCCCAGCCACCGCAGCCGAGAGCGCCTTCTCCACACTGGTGGAAGTGG 
CCGGAACGTGCGTGGCGCACTCGGAAGGGGAGCCTGGCAGGCCCCCACGCATGGACTGCGGCGCCGACGG 
CGAGTGGCTGGTGCCTGTGGGCCGCTGCAGCTGCAGCGCGGGATTCCAGGAGCGTGGTGACTTCTGCGAA 
TGTCCCCCAGGGTTTTACAAGGTGTCCCCGCGGCGGCCCCTCTGCTCACCGTGCCCAGAGGACAGCCGGG 
CCCTGGAAAACGCCTCCACGTTCTGCGTGTGCCAGGACAGCTATGCGCGCTCACCCACCGACCCGCCCTC 
GGCTTCOTGC^GCCGTCCGCCGTCGGCGCCGCGGQACCTGCAGTACAGCCTGAGCCGCTCGCCGCTGGTG 
CTGCGACTGCGCTGGCTGCCGCCGGCCGACTCGGGAGGCCGCTCGGACGTCACCTACTCGCTGCTGTGCC 
TGCGCTGCGGCCGCGAGGGCCCGGCGGGCGCCTGCGAGCCGTGCGGGCCGCGCGTGGCCTTCCTACCGCG 
CCAGGCAGGGCTGCGGGAGCGAGCCGCCACGGTGCTGCACCTGCGGCCCGGCGCGCGCTACACCGTGCGC 
GTGGCCGCGCTCAACGGCGTCTCGGGCCCGGCGGCCGCCGCGGGAACCACCTACGCGCAGGTCACCGTOT 
CCACCGGGCCCTCAGCGCCCTGGGAGGAGGATGAGATCCGCAGGGACCGAGTGGAACCCCAGAGCGTGTC 
GCTGTCGTGGCGGGAGCCCATCCCTGCCGGA6CCCCTGGGGCCAATGACACGGAGTACGAGATCCGATAC 
TACGAGAAGCAGAGTGAGCAGACTTACTCCATGGTGAAGACAGGGGCGCCCACAGTCACCGTCACCM 
TGAAGCCGGCTACCCGCTACGTCTTTCAGATCCGGGCCGCTTCCCCGGGGCCATCCTGGGAGGCCCAGAG 
TTTTAACGCGAGCATTGAAGTACAGACCCTGGGGGAGGCTGCCTCAGGGTCCAGGGACCA6AGCCCCGCC 
ATTGTCGTCACCGTAGTGACCATCTCGGCCCTCCTC6TCCTGGGCTCCGTGATGAGTGTGCTGGCCATTT 
GGAGGAGGAGGCCCTGCAGCTATGGCAAAGGAGGAGGGGATGCCCATGATGAAGA6GAGCTGTATTTCCA 
CTGTAAAGTCCCAACACGTCGCACATTCCTGGACCCCCAGAGCTGTGGGGACCTGCTGCAGGCTGTGCAT 
CTGTTCGCCAAGGAACTGGATGCGAAAAGCGTCy^CGCTGGAGAGGAGCCTTGGAGGAGGCAAGTTTGGGG 
AGCTGTGCTGTGGCTGCTTGCAGCTCCCCGGTCGCCAGGAGCTGCTCGTAGCCGTGCACATGCrGAGGGA 
CAGCGCCTCCGACTCACAGAGGCTCGGCTTCCTGGCCGAGGCCCrCACGCTGGGCCAGTTTGACCATAGG 
CACATCGTGCGGCTGGAGGGCQTTGTmCCCGAGGTAGGACCTTGATGATTGTCACCGAGTACATGAGCC 
ATGGGGCCCTGGACGGCTTCCTCAGGCACGAGGGGCAGCTGGTGGCTGGGCAACTGATGGGGTTGCTGCC 
TGGGCTGGCATGAGCCATGAAGTATCTGTCAGAGATGGGCTACGTTCACCGGGGCCTGGCAGCTCGCCAT 
GTGCTGGTCAGGAGCGACCTTGTCI'GCAAGATCTCTGGCTTCGGGCGGGGCCCCCGGGACCGATCAGAGG 
CTGTCTAOiCCACTGGCCGGAGCCCAGCGCTATGGGCCGCTCCCGAGACACTTCAGTTTGGCCACTTCAG 
CTCTGCCAGTGACGTGTGGAGCTTCGGCATGATCATGTGGGAGGTGATGGCCTTTGGGGAGCGGCCTTAC 
TGGGACATGTCTGGCCAAGACGTGAAGGCTGTGGAGGATGGCTTCCGGCTGCCACCCCCCAGGAACTGTC 
CTAACCTTCTGGACCGACTAATGCTCGACTGCTGQCyVGAAGGACCCAGGTGAGCGGCCCAGGTTCTCCCA 
GATCCACAGCATCCTGAGCAAGATGGTGCAGGACCCAGAGCCCCCCAAGTGTGCCCTGACTACCTGTCCC 
AGGCCTCCCACTCCACTAGCCGACCGTGCCTTCTCCACCTTCCCCTCCTTTGGCTCTGTGGGCGCGTGGC 
TGGAGGCCCTGGACCTGTGCCGCTACAAGGACAGCTTCGCGGCTGCTGGCTATGGGAGCCTGGAGGCCGT 
GGCCGAGATGACTGCCCAGGACCTGGTGAGCGTAGGCATCTCTTTGGCTGAACATCGAGAGGCCCTCCTC 
AGCGGGATCAGCGCCGTGCAGGCACGAGTGCTCCAGCTGCAGGGCCAGGGGGTGCAGGTGTGA 



The disclosed 29145493_EXT nucleic acid sequence has 735 of 1211 nucleotides (60%) 
identical to Kinase 1 from Mus musculus (GENBANK-ID:MMKIN1). 
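
The whole-number percentages quoted next to match counts in this section appear to be 
truncated rather than rounded (735/1211 is about 60.7%, reported as 60%). A minimal 
sketch of that convention follows; the truncation rule is an assumption inferred from 
the quoted figures, and the function name `percent_identity_floor` is illustrative, 
not part of any BLAST distribution. 

```python
def percent_identity_floor(matches: int, aligned_length: int) -> int:
    """Return the identity percentage truncated to a whole number."""
    if aligned_length <= 0 or not 0 <= matches <= aligned_length:
        raise ValueError("need aligned_length > 0 and 0 <= matches <= aligned_length")
    # Integer arithmetic avoids floating-point surprises near whole values.
    return (100 * matches) // aligned_length


# (matches, aligned length) pairs quoted in the text of this section.
quoted = {
    "29145493_EXT vs. MMKIN1 (nucleotides)": (735, 1211),
    "NOV3 vs. Gallus gallus receptor (identities)": (537, 1000),
}

for label, (matches, length) in quoted.items():
    print(f"{label}: {matches}/{length} = {percent_identity_floor(matches, length)}%")
```

Under this assumption the two pairs evaluate to 60% and 53%, which agrees with the 
percentages stated in the text. 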

The disclosed NOV3 polypeptide (SEQ ID NO:6) encoded by SEQ ID NO:5 is 1000 
amino acid residues and is presented using the one-letter code in Table 3B. The first 70 amino 
acids of the disclosed NOV3 protein were analyzed for signal peptide prediction and cellular 
localization. SignalP results predict that NOV3 is cleaved between positions 22 and 23 of SEQ 
ID NO:6, i.e., at the slash in the amino acid sequence SWA/HH. Psort and Hydropathy 
profiles also predict that NOV3 contains a signal peptide and is likely to be localized at the 
plasma membrane (certainty of 0.4600). 

Table 3B. Encoded NOV3 protein sequence (SEQ ID NO:6). 



MVLTTAIPAWLLSCSLPLSSWA/HHATmS^ 

YQVCKVI^PNQDNWLQTGWISRGRGQRIFVBLQET'LRDCSSIPGAAGTCKETEIIVYYLETEADLGRGRPR'- 

LGGSRPRKIDTIAADESETQGDLGERKMKLNTEVREIGPLSRRGFHIAFQDVGACVALVSVRVYYKQCRA 

TVRGIATFPATAAESAFSTLVEVAGTCVAHSEGEPGSPPRMHCGADGEWIiVPVGRCSCSAGFQERGDFCE 

CPPGFYKVSPRRPIiCSPCPEHSRALSNASTFCVCQDSYARSPTDPPSASCTRPPSAPRDLQYSLSRSPLV 

IJU»RWI*PPADSGGRSDVTY3LLCLRCGREGPAGACEPCGPRVAFLPRQAGLRERAATI»LHLRPGARYTVR 

VAAIiNGVSGPflAAAGTTYAQVTVSTGPSAPWEEDEIRRORVBPQSVSLSWREPIPAfiAPGANDTE^ 

YEKQSEQTYSMVKTGAPTVTVTNUCPATRYVFQIRAASPGPSWEAQSFNPSIEVQTLGEAASGSRDQSPA 

IWTVVTISALLVLGSVMSVIAIWRRRPCSYGKGGGDAHDEEELYFHCKVPTRRTFLDPQSCGDLLQ^^ 

LFAKELDAKSVTLERSLGGGKFGELCCGCLQLPGRQELLVAVHMLRDSASDSORLGrLAEALTl^ 

HIVRLEGVVTRGRTIJ«lVTEYMSHGALIX5FLRHEGQLVAGQIiiGLL 

VI,VSSDLVCKISGFGRGPRDR3EAVYTTGR3PAI.WAAPETLQFGHFSSASDVW3FGIIMWEVMAF6ERPY 

WDMSGQDVKAVEDGFRLPPPRNCPNLLHRIJ«LDCWQKDPGERPRFSQIHSILSKMVQDPEPPKCAL 

RPPTPLADRAFSTFPSFGSVGAWLEALDLCRYKDSFAAAGYGSLEAVAEMTAQDLVSLGISLAEHREALL 
SG3CSALQARVLQLQGQQVQV 

A BLASTX search was performed against public protein databases. The full amino 
acid sequence of the protein of the invention was found to have 537 of 1000 amino acid 
residues (53%) identical to, and 717 of 1000 residues (71%) positive with, the 993 amino acid 
residue tyrosine kinase receptor protein from Gallus gallus (ptnr:SPTREMBL-ACC:O42422 
EPH-LIKE RECEPTOR TYROSINE KINASE PRECURSOR (EC 2.7.1.112) (TYROSINE- 
PROTEIN KINASE RECEPTOR CEPHA7), SEQ ID NO:39 (E = 1.8e-288)). These proteins 
have large regions of identity, as shown in Table 3C. For example, the region from NOV3 
amino acids 148 to 181 has a stretch of 34 identical amino acids. 

Table 3C. Alignment of NOV3 with O42422 (SEQ ID NO:39). 

Score = 2779 (978.3 bits), Expect = 1.8e-288, P = 1.8e-288 
Identities = 537/1000 (53%), Positives = 721/1000 (72%), Frame = +1 

N073: 1 MVLTTAIPAHn^LSCSLPLSSWAHHATPPLRLWILLPSKASQAELGWTALPSNGWEEISG 60 

1 1 1 + +1 !++ 1 1+ I +1! tiiiiiii Ml I + I imiiii 

042422 1 MVLRSW^PPWlMLCSWLI^FAHTGEAQAAKmLXDSKAQQTELEWISSPPNGWEElSG 60 
N0V3: 61 VDEHDRPJRTYQVCNVLBPNQDNWLQTGWISRGRGQRIFVELQFTLRDCSSIPGAAGTCK 120 

n.o>,oo "^""^ iiiiim i+i ii+iti+i iiiiiii+Miiii+i+i( nil 

042422: 61 LDENYTPIRTYQVCQVMESNQNNWLRTNWlAKSNAQRirVELKFTLRDCNSLPGVLGTCK 120 
N0V3: 121 ETFN\nfYLETEADlX3RGRPRL6GSRPRKlDTlAADESFTQGDLGERKMi™T 180 

iiii^ii IK Ml + 4-+ imiimiitiiiimmMmimii 

042422 J 121 BTFNLYYYETDYDTGRN IRENQYVKIDTIAADESFTQGDLGERIMKI^TEVREIGPL 177 

N0V3: 181 SRRGFHLAFQDVGACVALV3VRVYYK0CRATVRGIATFPATAAE3AF3TLVEVAGTCVAH 240 

i++ii+iiiiimi+iiiii4.||||+i + + II II , , ii+iiii IIII+ 

Q42422: 178 SKKGrYIiAFQDVGAClALVSVKVyYKKCWSIIENLAIFPDTVTGSEFSSLVEVRGTCVSS 237 
N0V3: 241 SEGEPGSPPRMHCGADGEWljVPVGRCSCSAGFQERGDFCE-CPPGrYKVSPRRPLCSPCP 299 

n.o.-o^ ' ^ i+iiiiii+i+i I ii+i++n II I nil I + 11)1 

042422: 238 AEEEaVEN3PKMHCSAEGEWLVPIGKClCKAGYQQKGDTCEPCGRGFYKSSSQDL0C8RCP 297 
N0V3: 300 EHSRJa.ENASTFCVCQD3yARSPTDPPSASCTRPPSAPRDLQySLSR3PLVLRLRWLPPA 359 

" ^ 1+ M+iii i+i+iii +iitiim++i ++++++, + I I III 

042422: 298 THSrSDKEGSSRCPCEDSYYRAPSDPPYVACTRPPSAPQNI.IFNINQT-^-rTVSLEWSPPA 355 

N0V3; 360 DSGGRSDVTYSLLCLRCGREGPAGACEPCGPRVAFLPROAGLRERAATIiLHLRPGARYTV 419 

l-^-ll l + illl +IMI I Mill + + I II + l-H. t III 
042422: 3S6 DNGGRNDVTYRIl£KRCSWE--QGECVPC6SNlGYMPQC?rGLVDNYVTVMDLIJU^ 413 

N0V3I 420 RVAALNGVSGPAAAAGTTYAQVTVSTGPSAPWEEDEIRRDRVEPQSVSL8WREP1PAGAP 479 

^.o.o. M+tiii ++ +t I+++11 +11 + +++II +11 m+ii I 

042422: 414 ^VEAVNGVSD-LSRSQRLFAAV3ITTGQAAPSQVSGVMKERVI,QR3VELSWQEP-''-BHP 469 
NOV3: 480 0ANDTEYEiRYyEK-QSEC3TYSMVKTGAPTVTVl*l!Jl,KPATRYVEX}IRAASPGPSWEAQSF 538 

iiitt+iiiM i+iiMii + + ++ mil !miii + ++ 

042422: 470 NGVITEYEIKYYEKDQRERTYSTVKTKSTSASINKLKPGTVYVFQXRAFTAAGYG KY 526 

N0V3; 539 NPSlEVQTLGEAASG--3RDQSPAIWTVVTISAI»LVLGSVMaVliAlWRRRPCSYGKGGG 596 

+ M+M 1 1 1 + I +1 + 1 1++ 1 1 ++ ++I ++ I 1 1 M I 

042422: 527 SPRLDVATLEEATATAVSSEQNPVIIIAWAVAGTIILVFMVFQFIIGRRH-CGYSKA-- 583 

N0V3:. 597..DAHDEEELYrHCKVPTRRTFLDPQSCGDLLQAVHLFAKELDAKSVTLERSLGGGKB-GEIiC 656 

I +11111111 +I++1 1++ I +1 I It I I I I I I + +1 I +11 + 11 I + I 
042422: 584 D0EGDEELYPHPKFPGTKTYIDPETYBDPNRAVHQFAKELDA3GIKIERVIGAGEPGEVC 643 

N0V3: 657 CGCLQLPGRQELLVAVHMLRDSASDSQRLGFLAEALTLGQFDHSHIVRIiEOVVXRGRTLM 716 
^.^.o^ 11 + 111++"++ 11+ 1+ ++II nil +11111 ++I 11111111+ +1 

042422; , 644 SGRLKLPGKRDVAVAIKTLKVGYTEKQRRDFLCEASIMGQFDHPNWHLEGWTRGKPVM 703 

N0V3: 717 IVTEYMSHGALDGFLR-'IIEGQLVAGQLMGLLPGLASAMKYLSEMGYVHRGLAARHVLVSS 775 

r..^.n^ ... " •'^•i^- t+ii++iiiiii mi++n+i 

042422: 704 iVIEYMgKGALDAFLRKHDGQFTVIQLVGMLRGIAAGMRYlADMGYVHRDLAARNiLVNS 763 
N0V3: 776 DLVCKISGFG— RGPRDRSEAVYTT— GRSPALWAAPETLQFGHFSSASDVWSFGIIEWK 831 

II I I iiiiii I+I ! Ill +1+ i+niini+ii+iii 

042422: 764 NI*VCKV3DFGL3RVIEDDPEAVYTTTGGKIPVRWTAPEAIQYRKFT8A3DVW3YG1VMWE 823 
N0V3: 832 VMAFGERPYWDMSGQDV-KAVEDGFRLPPPRNCPNLLHRLMLDCWQKDPGERPRF3QIHS 890 

^.o.o. ii++ii)inni in n+i+i+ni i +n n+innin+ in^+i n 

042422; 824 ^VMSYGERpyWDMSNQDVIKAIEB6YRI.PAPMDCPAGLHQIJ«LDCWQKERGERPKrEQIVG 883 
N0V3: 891 ILSKMVQDPEPPKCALTTCE'RPPTPLADRArSTFPSFGSVGAWLEALDLCRyKDSFAAAG 950 

«.o.oo n " ' Mill +n 1+ 1 +1 III n+i+ + iin+i in 

042422- 884 ILDKMIRNPNSLKTPLGTCSRPISPI.LDQNl*PDFTTFCSVGEffI.QAIKMERYKDNFTAAG 943 
N0V3; 951 YGSLEAVAEMTAQDLVSLGISLAEHREALLSGISALQARVLQLQGOGVQV 1000 

^.o.o. . . ' " +i++mi+i I++ ++I I ++I++I ( t i+ii 

042422: 944 YKSLESVAR MTIEDVMSLGITLVGHQKKIMSSICymRACaCJiLHGTGIQV 993 

Table 3D. Patp alignments of NOV3 

                                                                     Smallest 
                                                                       Sum 
                                                           Reading  High   Prob 
Sequences producing High-scoring Segment Pairs:             Frame   Score  P(N) 

patp:R85092 EPH-like receptor protein tyrosine kinase ...    +1     2768   2.2e-287 
patp:W03421 Mouse developmental kinase 1 - Mus sp, 998 aa.   +1     2762   9.6e-287 
patp:R85090 EPH-like receptor protein tyrosine kinase ...    +1     2395   5e-248 
patp:R75711 Eph-related PTK Cek4 - Gallus sp, 983 aa.        +1     2320   6e-240 
patp:W83147 Rat receptor tyrosine kinase Ehk-1 - Rattus,     +1     2307   6e-238 
patp:Rg5936 Protein tyrosine-kinase bpTK7 - H. sapiens...    +1     2269   7e-234 



The disclosed NOV3 protein (SEQ ID NO:6) also has good identity with a number of 
receptor tyrosine kinase proteins, as shown in Table 3E. 

This information is presented graphically in the multiple sequence alignment given in 
Table 3F (with NOV3 being shown on line 1) as a ClustalW analysis comparing NOV3 with 
related protein sequences. 



Table 3E. BLAST results for NOV3 

Gene Index/Identifier                 Protein/Organism                      Length  Identity       Positives      Expect 
                                                                            (aa)    (%)            (%) 

gi|2497572|sp|Q15375|EPA7_HUMAN       EPHRIN TYPE-A RECEPTOR 7 PRECURSOR    998     513/998 (51%)  674/998 (67%)  0.0 
                                      (TYROSINE-PROTEIN KINASE) 
                                      Homo sapiens 

gi|2462302|emb|CAA74643.1|            Eph-like receptor tyrosine kinase     993     513/993 (51%)  676/993 (67%)  0.0 
(Y14271)                              Gallus gallus 

gi|2497573|sp|Q61772|EPA7_MOUSE       EPHRIN TYPE-A RECEPTOR 7 PRECURSOR    998     512/998 (51%)  673/998 (67%)  0.0 
                                      (TYROSINE-PROTEIN KINASE RECEPTOR 
                                      EHK-3; EPH HOMOLOGY KINASE-3; 
                                      EMBRYONIC BRAIN KINASE; EBK; 
                                      DEVELOPMENTAL KINASE 1; MDK-1) 
                                      Mus musculus 

gi|1706631|sp|P54759|EPA7_RAT         Ehk-3, full length form               998     510/998 (51%)  674/998 (67%)  0.0 
(U21954)                              Rattus norvegicus 

gi|7434436|pir||I78843 (L36644)       receptor protein-tyrosine kinase      991     452/961 (47%)  621/961 (64%)  0.0 
                                      Homo sapiens 

Table 3F. Information for the ClustalW proteins: 

1) Novel NOV3 (SEQ ID NO:6) 

2) gi|4758282|ref|NP_004431.1| EphA7; Hek11; ephrin receptor (SEQ ID NO:40) 

3) gi|8134447|sp|O42422|EPA7 Ephrin Type-A Receptor 7 Precursor (Tyrosine-PK Receptor Cepha7) (CEK11) (SEQ ID NO:41) 

4) gi|2497573|sp|Q61772|Epa7 Mouse Ephrin Type-A Receptor 7 Precursor (Tyrosine-PK Receptor Ehk-3) (Eph Homology 
Kinase-3) (Embryonic Brain Kinase) (EBK) (Developmental Kinase 1) (SEQ ID NO:42) 

5) gi|1706631|sp|P54759|EPA7_RAT Ephrin Type-A Receptor 7 Precursor (Tyrosine-Protein Kinase Receptor Ehk-3) (Eph 
Homology Kinase-3) (SEQ ID NO:43) 

6) gi|7434436|pir||I78843 receptor protein-tyrosine kinase - human (fragment) (SEQ ID NO:44) 

[Alignment graphic: ClustalW multiple sequence alignment of NOV3 with gi|4758282|, 
gi|8134447|, gi|2497573|, gi|1706631| and gi|7434436|.] 

DOMAIN results for NOV3 were collected from the Conserved Domain Database 
(CDD) with Reverse Position Specific BLAST. This BLAST samples domains found in the 
Smart and Pfam collections. The NOV3 protein aligned with a number of related domains in 
both collections. 



Table 3G. Domain analysis for NOV3 

Gene index identifier                           Results 

gnl|Pfam|pfam01404, EPH_lbd, Ephrin             CD-Length = 174 residues, 99.4% aligned 
receptor ligand binding domain                  Score = 301 bits (772), Expect = 8e-83 

gnl|Smart|TyrKc, Tyrosine kinase,               CD-Length = 257 residues, 100.0% aligned 
catalytic domain; Phosphotransferases,          Score = 253 bits (645), Expect = 4e-68 
Tyrosine-specific kinase subfamily. 

gnl|Pfam|pfam00069, pkinase, Eukaryotic         CD-Length = 256 residues, 97.3% aligned 
protein kinase domain                           Score = 162 bits (411), Expect = 5e-41 

gnl|Smart|S_TKc, Serine/Threonine               CD-Length = 256 residues, 97.3% aligned 
protein kinases, catalytic domain;              Score = 133 bits (334), Expect = 5e-32 
Phosphotransferases. Serine or 
threonine-specific kinase subfamily. 

gnl|Smart|SAM, Sterile alpha motif.             CD-Length = 68 residues, 86.8% aligned 
                                                Score = 65.1 bits (157), Expect = 2e-11 

gnl|Pfam|pfam00536, SAM, SAM domain             CD-Length = 64 residues, 89.1% aligned 
(Sterile alpha motif)                           Score = 59.7 bits (143), Expect = 7e-10 



NOV3 shows similarity with the Ephrin receptor ligand binding domain, which is 
found in Eph receptor tyrosine kinases. Also, NOV3 has similarity to the sterile alpha motif. 

Amino acids 33 through 208 of NOV3 align with the 174 amino acid ephrin receptor 
ligand binding domain (SEQ ID NO:45), as shown in Table 3H. Amino acids 641 through 
892 align with amino acids 1 through 257 of the 257 amino acid tyrosine kinase catalytic 
domain (SEQ ID NO:46), as shown in Table 3I. Additionally, amino acids 925 through 983 of 
NOV3 align with amino acids 4 through 62 of the 68 amino acid sterile alpha motif (SEQ ID 
NO:47), which is a widespread domain in signaling and nuclear proteins. In EPH-related 
tyrosine kinases, SAM appears to mediate cell-cell initiated signal transduction via the binding 
of SH2-containing proteins to a conserved tyrosine that is phosphorylated. In many cases, 
SAM mediates homodimerisation. The alignment of NOV3 with the SAM domain is shown in 
Table 3J. These similarities indicate that the NOV3 sequence has properties similar to those 
of other proteins known to contain these domains. 
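
The "% aligned" figures in Table 3G can be related to the domain coordinates quoted 
above. The sketch below assumes that "% aligned" means the fraction of domain model 
positions covered by the alignment; that reading is an assumption, although it 
reproduces the two cases for which domain coordinates are stated in the text. The 
helper name `domain_coverage` is illustrative. 

```python
def domain_coverage(first_aligned: int, last_aligned: int, domain_length: int) -> float:
    """Percentage of a domain model covered by an alignment.

    Coordinates are 1-based and inclusive, as quoted in the text.
    """
    if not 1 <= first_aligned <= last_aligned <= domain_length:
        raise ValueError("coordinates must satisfy 1 <= first <= last <= domain_length")
    aligned_positions = last_aligned - first_aligned + 1
    return round(100.0 * aligned_positions / domain_length, 1)


# Tyrosine kinase catalytic domain: positions 1 through 257 of 257.
print(domain_coverage(1, 257, 257))  # 100.0

# Sterile alpha motif: positions 4 through 62 of 68.
print(domain_coverage(4, 62, 68))  # 86.8
```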



Table 3H. Domain Analysis of NOV3 

Ephrin receptor ligand binding domain (SEQ ID NO:45) 

[Alignment graphic: NOV3 aligned with gnl|Pfam|pfam01404.] 

Table 3I. Domain Analysis of NOV3 

Tyrosine kinase catalytic domain (SEQ ID NO:46) 

[Alignment graphic: NOV3 aligned with gnl|Smart|TyrKc.] 

Table 3J. Domain Analysis of NOV3 

Sterile alpha motif domain (SEQ ID NO:47) 

[Alignment graphic: NOV3 aligned with gnl|Smart|SAM.] 



Recent research has been directed to elucidating the developmental functions and 
biochemistry of Eph receptor tyrosine kinases and their membrane-bound ligands, ephrins. 
See, generally, Wilkinson, Int. Rev. Cytol. 196:177-244, 2000. The crystal structure of the 
amino-terminal ligand-binding domain of the receptor tyrosine kinase EphB2 (also known as 
Nuk) has been determined. Himanen, et al., Nature 396:486-491, 1998. The Eph receptors, 
which bind a group of cell-membrane-anchored ligands known as ephrins, represent the largest 
subfamily of receptor tyrosine kinases (RTKs). They are predominantly expressed in the 
developing and adult nervous system and are important in contact-mediated axon guidance, 
axon fasciculation and cell migration. Eph receptors are unique among RTKs in that 
they fall into two subclasses with distinct ligand specificities, and in that they can themselves 
function as ligands to activate bidirectional cell-cell signaling. The N-terminal domain folds 
into a compact jellyroll beta-sandwich composed of 11 antiparallel beta-strands. An extended 
loop that is important for ligand binding and class specificity has been identified. This loop, 
which is conserved within but not between Eph RTK subclasses, packs against the concave 
beta-sandwich surface near positions at which missense mutations cause signaling defects, 
localizing the ligand-binding region on the surface of the receptor. 

EphA receptors bind to GPI-anchored ephrin-A ligands, while EphB receptors bind to 
ephrin-B proteins that have a transmembrane and cytoplasmic domain. Ephrin-B proteins 
transduce signals, such that bidirectional signaling can occur upon interaction with Eph 
receptor. In many tissues, specific Eph receptors and ephrins have complementary domains, 
whereas other family members may overlap in their expression. An important role of Eph 
receptors and ephrins is to mediate cell-contact-dependent repulsion. Complementary and 
overlapping gradients of expression underlie establishment of a topographic map of neuronal 
projections in the retinotectal system. Eph receptors and ephrins also act at boundaries to 
channel neuronal growth cones along specific pathways, restrict the migration of neural crest 
cells, and via bidirectional signaling prevent intermingling between hindbrain segments. Eph 
receptors and ephrins can also trigger an adhesive response of endothelial cells and are 
required for the remodeling of blood vessels. Biochemical studies suggest that the extent of 
multimerization of Eph receptors modulates the cellular response and that the actin 
cytoskeleton is one major target of the intracellular pathways activated by Eph receptors. Eph 
receptors and ephrins have thus emerged as key regulators of the repulsion and adhesion of 
cells that underlie the establishment, maintenance, and remodeling of patterns of cellular 
organization. 

The nucleic acids and proteins of the invention are useful in potential therapeutic 
applications implicated in various tyrosine kinase-related pathological disorders and/or 
ephrin-related pathological disorders, described further below. For example, a cDNA encoding 
the kinase-like protein may be useful in gene therapy, and the kinase-like protein may be useful 
when administered to a subject in need thereof. SeqCalling expression data and the expression 
of tyrosine kinase family members suggest that NOV3 is expressed in mammary tissue, breast 
cancer tissues, endothelial cells, and multiple embryonic and developmental tissues. 

By way of nonlimiting example, the compositions of the present invention will have 
efficacy for treatment of patients suffering from various disorders, including, for example, 
angiogenesis, cell signaling disorders, cancer, fertility disorders, reproductive disorders, 
tissue/cell growth regulation disorders, developmental disorders and resulting disorders 
derived from the above conditions. Other kinase-related diseases and disorders are 
contemplated. 

The novel nucleic acid encoding the tyrosine kinase-like protein of the invention, or 
fragments thereof, may further be useful in diagnostic applications, wherein the presence or 
amount of the nucleic acid or the protein is to be assessed. These materials are further useful 
in the generation of antibodies that bind immunospecifically to the novel substances of the 
invention for use in therapeutic or diagnostic methods. These antibodies may be generated 
according to methods known in the art, using prediction from hydrophobicity charts, as 
described in the "Anti-NOVX Antibodies" section below. For example, the disclosed NOV3 
protein has multiple hydrophilic regions, each of which can be used as an immunogen. The 
novel NOV3 protein can be used in assay systems for functional analysis of various human 
disorders, which will help in understanding of pathology of the disease and development of 
new drug targets for various disorders. 
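
Selecting hydrophilic regions from a hydrophobicity chart can be sketched as a 
sliding-window average. The text does not say which scale or window size was used; 
the Kyte-Doolittle scale and the 7-residue window below are assumptions chosen for 
illustration, and the function names are hypothetical. The example fragment is a 
stretch of NOV3 that is legible in Table 3B. 

```python
# Kyte-Doolittle hydropathy values (positive = hydrophobic, negative = hydrophilic).
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def hydropathy_profile(sequence: str, window: int = 7) -> list[float]:
    """Mean hydropathy of every full window along the sequence."""
    if window < 1 or window > len(sequence):
        raise ValueError("window must be between 1 and the sequence length")
    scores = [KYTE_DOOLITTLE[residue] for residue in sequence]
    return [
        sum(scores[i:i + window]) / window
        for i in range(len(scores) - window + 1)
    ]


def most_hydrophilic_window(sequence: str, window: int = 7) -> tuple[int, str, float]:
    """Return (1-based start, peptide, mean hydropathy) of the lowest-scoring window."""
    profile = hydropathy_profile(sequence, window)
    start = min(range(len(profile)), key=profile.__getitem__)
    return start + 1, sequence[start:start + window], profile[start]


# Fragment of NOV3 as legible in Table 3B (used here only as example input).
fragment = "RRRPCSYGKGGGDAHDEEELYFHCKVPTRRTFLDPQS"
start, peptide, score = most_hydrophilic_window(fragment)
print(f"start={start} peptide={peptide} mean_hydropathy={score:.2f}")
```

A window with a strongly negative mean is a candidate hydrophilic region; in practice 
such a candidate would still need to be checked for surface exposure and uniqueness 
before being used as an immunogen. 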

NOV4 

The novel NOV4 nucleic acid was identified on chromosome 6 by TblastN using 
CuraGen Corporation's sequence file for chloride conductance regulatory or homolog as run 
against the Genomic Daily Files made available by GenBank or from files downloaded from 
the individual sequencing centers. The nucleic acid sequence was predicted from the genomic 
file Sequencing Center_nh0124i04 by homology to a known chloride conductance regulatory 
gene or homolog. Exons were predicted by homology and the intron/exon boundaries were 
determined using standard genetic rules. Exons were further selected and refined by means of 
similarity determination using multiple BLAST (for example, tBlastN, BlastX, and BlastN) 
searches, and, in some instances, GeneScan and Grail. Expressed sequences from both public 
and proprietary databases were also added when available to further define and complete the 
gene sequence. The DNA sequence was then manually corrected for apparent inconsistencies 
thereby obtaining the sequences encoding the full-length protein. 

The disclosed nucleic acid of 742 nucleotides (designated GM_95074063_A, SEQ ID 
NO:7) encoding a novel chloride conductance regulatory-like protein is shown in Table 4A. 
An open reading frame was identified beginning with an ATG initiation codon at nucleotides 
28-30 and ending with a TGA codon at nucleotides 724-726. A putative untranslated region 
upstream from the initiation codon and downstream from the termination codon is underlined 
in Table 4A, and the start and stop codons are in bold letters. The encoded protein having 232 
amino acid residues is presented using the one-letter code in Table 4C (SEQ ID NO:8). 
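
The reading-frame arithmetic implied by the stated coordinates can be checked with a 
short sketch. It assumes 1-based, inclusive nucleotide positions and that the 
termination codon is not translated; `orf_summary` is an illustrative helper, not a 
tool used in the original analysis. 

```python
def orf_summary(start_first: int, stop_last: int, total_length: int) -> dict:
    """Summarise an open reading frame from 1-based, inclusive coordinates.

    start_first: first nucleotide of the initiation codon.
    stop_last:   last nucleotide of the termination codon.
    """
    if not 1 <= start_first < stop_last <= total_length:
        raise ValueError("coordinates must lie inside the sequence")
    span = stop_last - start_first + 1  # includes the termination codon
    if span % 3 != 0:
        raise ValueError("start and stop codons are not in the same frame")
    codons = span // 3
    return {
        "five_prime_utr_nt": start_first - 1,
        "orf_nt_including_stop": span,
        "encoded_residues": codons - 1,  # the stop codon encodes no residue
        "three_prime_utr_nt": total_length - stop_last,
    }


# ATG at nucleotides 28-30, TGA at nucleotides 724-726, 742 nucleotides in total.
print(orf_summary(28, 726, 742))
```

For these coordinates the sketch gives a 27-nucleotide upstream region, a 
699-nucleotide reading frame including the stop codon, 232 encoded residues, and a 
16-nucleotide downstream region. 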



Table 4A. NOV4 Nucleotide Sequence (SEQ ID NO:7). 

TTAACATTGTGGTACATTGARAAATACA !rGCCGARTAGTTTCCTGCTACC^GAGCCAGCAGAGGGG(^ 

TGCAGCAGCAGCCAGACACCTiAGGCTGTGCTGAACAGQAAGGTCCTCCGCACTGGTACCCTTTATATCGC 

TGAGAGCCACCTGTCrrTGGTTAGATAGCfCTGGATTAGGATTCTCACTGGAATACC 

CTTGCATTATCCAGGGACGAAAGTGACTGTCTAQGAGAACATTTGTATGCTATGGTGAATGA 

AAGAATCCAAAGAATCTGTTGCTGATGAAGAAGAGGAAGACA6TGATGATGTTGAACTTATTACTGAATT 

TATATTTGTACCTAGTGATAAATCAGCACTGGGGGCAATGTTCACTGCAATGTGTGAATGCCAGGCCTTG 

CATCCAGATGCTGAGGATGAGGATGAGGATGACTACGATGGAGAAGAATATGATGTGGAAGCACATGAAC 

GAGGAAAAGGGGACATCCrTAAATCTTACACCTATGAAGGATTATCCCATTTAACAGCAGAAGGCCAAGC 

CACATTGGAGAGATTAGAAGAAATGCTTTCTCAATCTGTGAGCAGCCAGTATAATATGGCTGGGGTCAG^ 

ACAGAAGATTCAATAAGGGATTATGAAGATGGGATGGAGGTAGATACCACACCAACAGTTGCTGGACAGT 

TTGAGGATACAGATGTTGATGACTQAAAATAATTTATGCAG 

The disclosed NOV4 nucleic acid sequence has 620 of 711 bases (87%) identical to a 
1579 bp Canis familiaris chloride conductance regulatory mRNA (GENBANK-ID: 
CCCC|acc:X65450) (E = 5.4e-114). In a search of sequence databases, it was also found that 
the nucleic acid sequence has 460 of 508 bases (90%) identical to a 1368 bp Homo sapiens 
chloride conductance regulatory mRNA (GENBANK-ID: HS510B21) (E = 1.2e-87). 

In a search of CuraGen's proprietary human expressed sequence assembly database, 
assembly s3aq:95074063 (1860 nucleotides) was identified as having >95% homology to this 
predicted gene sequence (Table 4B). This database is composed of the expressed sequences 
(as derived from isolated mRNA) from more than 96 different tissues. The mRNA is 
converted to cDNA and then sequenced. These expressed DNA sequences are then pooled in 
a database and those exhibiting a defined level of homology are combined into a single 
assembly with a common consensus sequence. The consensus sequence is representative of 
all member components. Since the nucleic acid of the described invention has >95% sequence 
identity with the CuraGen assembly, the nucleic acid of the invention represents an expressed 
gene sequence. This DNA assembly has 1200 components and was found by CuraGen to be 
expressed in the following tissues: colon, spleen, lung, small intestine, pancreas, heart, testis, 
fetal and adult kidney, fetal liver, amygdala, adipose, pituitary gland, lymph node, lung tumor, 
and bone marrow. 
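
The idea of combining homologous fragments into one assembly with a consensus 
sequence can be illustrated with a toy sketch. The actual assembly procedure is not 
described in the text; the version below assumes fragments that are already aligned 
to equal length and uses a simple per-column majority vote, and the fragment data and 
function names are hypothetical. 

```python
from collections import Counter


def percent_identity(a: str, b: str) -> float:
    """Percentage of identical positions between two equal-length aligned sequences."""
    if len(a) != len(b) or not a:
        raise ValueError("sequences must be non-empty and of equal length")
    matches = sum(x == y for x, y in zip(a, b))
    return 100.0 * matches / len(a)


def majority_consensus(aligned_fragments: list[str]) -> str:
    """Per-column majority vote over pre-aligned fragments of equal length."""
    if not aligned_fragments or len({len(f) for f in aligned_fragments}) != 1:
        raise ValueError("fragments must be non-empty and of equal length")
    return "".join(
        Counter(column).most_common(1)[0][0] for column in zip(*aligned_fragments)
    )


# Hypothetical pre-aligned fragments standing in for assembly components.
fragments = ["ACGTTGCA", "ACGTTGCA", "ACGATGCA"]
consensus = majority_consensus(fragments)
print(consensus)
print(percent_identity(fragments[2], consensus))

# Identity reported in Table 4B: 172 of 179 positions, which exceeds 95%.
print(100.0 * 172 / 179 > 95.0)
```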



Table 4B. NOV4 alignment with S3aq95074063 (SEQ ID NO:48) 

S3aq:95074063 Category D: 1200 frag (1 5'sig-CG, 1135 3'sig-CG, 57 
non-CG EST, 7 non-CG Non-EST), 1860 bp. 

Plus Strand HSPs: Score = 832 (124.8 bits), Expect = 5.1e-32, P = 5.1e-32 
Identities = 172/179 (96%), Positives = 172/179 (96%), Strand = Plus / Plus 

N0V4: 552 ACyiTTGGAGWSATTAGAAGAAATGCTTTCTCAATCTGTGAGCAGCCAGTATAATATGGCT 621 

I niiiiiiMiiMiii niiiiiMiii iiiiiiiiiiiitiiiiiiiiiMitt 

S3aq: 1108 ATAta?GGAGAGATTAfiAAGGAATGCTTTCTCAGTCTGTGAGCAi^ 1167 
i?0V4! 622 GGGGTCJW^GAQVGAAGATTCAATAAGGGATTATGAAGATGGGATGGAGGTAGATACCAC^ 681 

1 1 1 1 M 1 1 1 1 1 1 1 1 1 1 M I n 1 1 1 1 1 1 1 i 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 

S3aq; 11 68 GGGGTCAGGACAGAAGATTCAATAAGAGATTATGAAGATGGGATGGAGGTGGATACCACA 1227 

N0V4: 682 CCAACAGTTGCTGGACAGTTTGAGGATACAGATGTTGATCACTGAAAATAATTTATGCA 740 

IlitllMIIIIillilllllllllll lllllljlllllllllljllt lllllllfl 
33aq: 1228 CCAACAGTTGCTGGACAGTCTGAGGAXGCAGATGTTGATCACTGAAAATGATTTATGCA 1286 



The NOV4 polypeptide (SEQ ID NO:8) encoded by SEQ ID NO:7 is presented using 
the one-letter amino acid code in Table 4C. The Psort profile for NOV4 predicts that this 
sequence is likely to be localized at the plasma membrane with a certainty of 0.4500. 

Table 4C. NOV4 protein sequence (SEQ ID NO:8) 

MPNSBlSiPEPAEGHLQ^^ 

HLYAMVNDKrEESKESVADEEEEDSDDVELITEFlFVPSDKSALGAMFTAMCECQALH^ 

DVEAHERGKGDILKSYTYEGLSHL'TAEGQATLERLEEMLSQSVSSQYNl^GVRTEDSIRDYED^ 

GQFEDTDVDH 



The full amino acid sequence of the disclosed NOV4 polypeptide has 202 of 232 
amino acid residues (87%) identical to, and 207 of 232 residues (89%) positive with, the 237 
amino acid residue protein from Homo sapiens chloride channel (chloride conductance 
regulatory protein, chloride ion current inducer protein) (ptnr:SPTREMBL-ACC:P54105, E = 
2.0e-^^). 

BLAST results include sequences from the Patp database, which is a proprietary 
database that contains sequences published in patents and patent publications. The Patp 
results include those listed in Table 4D. See, e.g., European Patent 1033401, describing a 
human secreted protein. 



Table 4D. Patp alignments of NOV4 

                                                                   Smallest 
                                                                     Sum 
                                                         Reading  High   Prob 
Sequences producing High-scoring Segment Pairs:           Frame   Score  P(N) 

Patp:G01583 Human secreted protein ...                     +1     Jli    5.4e-39 
Patp:G04766 Arabidopsis thaliana protein fragment .        +1     186    9.0e-14 
Patp:G04767 Arabidopsis thaliana protein fragment .        +1     186    9.0e-14 
Patp:G04768 Arabidopsis thaliana protein fragment .        +1     148    1.3e-Q^3 


The disclosed NOV4 protein (SEQ ID NO:8) also has good identity with a number of 
chloride channel proteins. The identity information used for ClustalW analysis is presented in 
Table 4E. 



Table 4E. BLAST results for NOV4 

Gene Index/Identifier                 Protein/Organism                      Length  Identity       Positives      Expect 
                                                                            (aa)    (%)            (%) 

gi|4502891|ref|NP_001284.1|           chloride channel,                     237     168/232 (72%)  174/232 (74%)  1e-79 
(U17899), (X91788), (U53454),         nucleotide-sensitive, 1A 
(AF005422), (AF026003),               Homo sapiens 
(AF232224) 

gi|8571386|gb|AAF76859.1|             chloride ion current inducer          237     167/232 (71%)  173/232 (73%)  4e-79 
(AF232225)                            protein I(Cln) 
                                      Homo sapiens 

gi|8571390|gb|AAF76861.1|             chloride ion current inducer          237     167/232 (71%)  173/232 (73%)  6e-79 
(AF232709)                            protein I(Cln) 
                                      Homo sapiens 

gi|1095482|prf||2109219A              Cl current-related protein            236     159/231 (68%)  165/231 (70%)  1e-73 
                                      Oryctolagus cuniculus 

gi|1060971|dbj|BAA05069.1|            chloride channel                      252     159/231 (68%)  165/231 (70%)  1e-73 
(D26076)                              Oryctolagus 


This information is presented graphically in the multiple sequence alignment given in 
Table 4F (with NOV4 being shown on line 1) as a ClustalW analysis comparing NOV4 with 
related chloride channel sequences. 

Table 4F. Information for the ClustalW proteins: 

1) NOV4 (SEQ ID NO:8) 

2) gi|4502891|ref|NP_001284.1| chloride channel, nucleotide-sensitive, 1A (SEQ ID NO:49) 

3) gi|8571386|gb|AAF76859.1| (AF232225) chloride ion current inducer protein I(Cln) (SEQ ID NO:50) 

4) gi|8571390|gb|AAF76861.1| (AF232708) chloride ion current inducer protein I(Cln) (SEQ ID NO:51) 

5) gi|1095482|prf||2109219A Cl current-related protein (SEQ ID NO:52) 

6) gi|1060971|dbj|BAA05069.1| (D26076) chloride channel (SEQ ID NO:53) 



[Alignment graphic: ClustalW multiple sequence alignment of NOV4 with gi|4502891|, 
gi|8571386|, gi|8571390|, gi|1095482| and gi|1060971|.] 



The similarity between the disclosed NOV4 and a number of chloride conductance 
proteins suggests that NOV4 may function as a member of the chloride conductance 
regulatory-like protein family. 

Transporters, channels, and pumps that reside in cell membranes are key to 
maintaining the right balance of ions in cells, and are vital for transmitting signals from nerves 
to tissues. The consequences of defects in ion channels and transporters are diverse, 
depending on where they are located and what their cargo is. In the heart, defects in potassium 
channels do not allow proper transmission of electrical impulses, resulting in the arrhythmia 
seen in long QT syndrome. In the lungs, failure of a sodium and chloride transporter found in 
epithelial cells leads to the congestion of cystic fibrosis, while one of the most common 
inherited forms of deafness, Pendred syndrome, looks to be associated with a defect in a 
sulfate transporter. Chloride channels in the ocular ciliary epithelium are believed to play a 
key role in aqueous humor formation. Anguita et al., Biochem Biophys Res Commun. 
208:89-95, 1995. 

Chloride channels (CLC) perform important roles in the regulation of cellular 
excitability, in transepithelial transport, cell volume regulation, and acidification of 
intracellular organelles. This variety of functions requires a large number of different chloride 
channels that are encoded by genes belonging to several unrelated gene families. The CLC 
family of chloride channels has nine known members in mammals that show a differential 
tissue distribution and function both in plasma membranes and in intracellular organelles. 
CLC proteins have about 10-12 transmembrane domains. They probably function as dimers 
and may have two pores. The functional expression of channels altered by site-directed 
mutagenesis has led to important insights into their structure-function relationship. Their 
physiological relevance is obvious from three human inherited diseases (myotonia congenita, 
Dent's disease, and Bartter's syndrome) that result from mutations in some of their members 
and from a knock-out mouse model. Jentsch et al., Pflugers Arch 437:783-795, 1999. 

Recent studies of hereditary renal tubular disorders have facilitated the identification
and roles of chloride channels and co-transporters in the regulation of the most abundant
anion, Cl-, in the ECF. Thus, mutations that result in a loss of function of the voltage-gated
chloride channel, CLC-5, are associated with Dent's disease, which is characterized by
low-molecular weight proteinuria, hypercalciuria, nephrolithiasis, and renal failure. Mutations of
another voltage-gated chloride channel, CLC-Kb, are associated with a form of Bartter's
syndrome, whereas other forms of Bartter's syndrome are caused by mutations in the
bumetanide-sensitive sodium-potassium-chloride cotransporter (NKCC2) and the potassium
channel, ROMK. Finally, mutations of the thiazide-sensitive sodium-chloride cotransporter
(NCCT) are associated with Gitelman's syndrome. Thakker, Adv Nephrol Necker Hosp
29:289-298, 1999. These studies have helped to elucidate some of the renal tubular
mechanisms regulating mineral homeostasis and the role of chloride channels.

A more prominent case of chloride channel dysfunction is cystic fibrosis. Cystic
fibrosis (CF) is a genetic disease with multi-system involvement in which defective chloride
transport across membranes causes dehydrated secretions. CF affects
approximately 1 in 2000 people, making it one of the commonest fatal inherited diseases in the
Caucasian population. Dysfunction of the cystic fibrosis transmembrane conductance
regulator (CFTR) Cl- channel is also associated with a wide spectrum of disease. Hwang &
Sheppard, Trends Pharmacol Sci 20:448-453, 1999. The protein encoded by the CF gene, the
cystic fibrosis transmembrane conductance regulator (CFTR), functions as a cyclic adenosine
monophosphate-regulated chloride channel. The ability to detect CFTR mutations has led to
the recognition of its association with a variety of conditions, including chronic bronchitis,
sinusitis with nasal polyps, pancreatitis, and, in men, infertility. Choudari et al., Gastroenterol
Clin North Am 28:543-549, vii-viii, 1999. In the search for modulators of CFTR,
pharmacological agents that interact directly with the CFTR Cl- channel have been identified.
Some agents stimulate CFTR by interacting with the nucleotide-binding domains that control
channel gating, whereas others inhibit CFTR by binding within the channel pore and
preventing Cl- permeation. Knowledge of the molecular pharmacology of CFTR might lead
to new treatments for diseases caused by the dysfunction of CFTR. Chloride channels may
participate in cellular volume control by activation of a swelling-induced chloride conductance
pathway.

The nucleic acids and proteins of NOV4 are useful in potential therapeutic applications
implicated in various chloride channel-related pathological disorders. For example, a cDNA
encoding the chloride channel-like protein may be useful in gene therapy, and the chloride
channel-like protein may be useful when administered to a subject in need thereof. The
protein similarity information, expression pattern, and location for the chloride channel-like
protein and nucleic acid disclosed herein suggest that this chloride channel may have
important structural and/or physiological functions characteristic of the chloride channel
family. Therefore, the nucleic acids and proteins of the invention are useful in potential
diagnostic and therapeutic applications and as a research tool. These include serving as a
specific or selective nucleic acid or protein diagnostic and/or prognostic marker, wherein the
presence or amount of the nucleic acid or the protein are to be assessed, as well as potential
therapeutic applications such as the following: (i) a protein therapeutic, (ii) a small molecule
drug target, (iii) an antibody target (therapeutic, diagnostic, drug targeting/cytotoxic antibody),
(iv) a nucleic acid useful in gene therapy (gene delivery/gene ablation), (v) a composition
promoting tissue regeneration in vitro and in vivo, and (vi) a biological defense weapon.

The nucleic acids and proteins of the invention are useful in potential diagnostic and
therapeutic applications implicated in various diseases and disorders described below. For
example, the nucleic acids and proteins of the invention are useful in potential therapeutic
applications implicated in cystic fibrosis, congenital myotonia, Dent disease, an X-linked renal
tubular disorder, leukoencephalopathy, malignant hyperthermia, and hypertension. For
example, a cDNA encoding the chloride conductance regulatory-like protein may be useful in
gene therapy, and the chloride conductance regulatory-like protein may be useful when
administered to a subject in need thereof.

The NOV4 compositions of the present invention will have efficacy for treatment of
patients suffering from, for example, cystic fibrosis, congenital myotonia, Dent disease, an
X-linked renal tubular disorder, leukoencephalopathy, malignant hyperthermia, and hypertension.
Other pathologies and disorders are contemplated.

The novel nucleic acid encoding a chloride conductance regulatory-like protein, and
the chloride conductance regulatory-like protein of the invention, or fragments thereof, may
further be useful in diagnostic applications, wherein the presence or amount of the nucleic acid
or the protein are to be assessed. These materials are further useful in the generation of
antibodies that bind immunospecifically to the novel substances of the invention for use in
therapeutic or diagnostic methods and other diseases, disorders and conditions of the like.
These antibodies may be generated according to methods known in the art, using prediction from
hydrophobicity charts, as described in the "Anti-NOVX Antibodies" section below.

For example, the disclosed NOV4 protein has multiple hydrophilic regions, each of
which can be used as an immunogen. In one embodiment, a contemplated NOV4 epitope is
from about amino acids 5 to 25. In another embodiment, a NOV4 epitope is from about amino
acids 65 to 105. In additional embodiments, NOV4 epitopes are from amino acids 125 to 230.
These novel proteins can also be used to develop assay systems for functional analysis.
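
The prediction of hydrophilic regions from a hydrophobicity chart can be illustrated with a sliding-window hydropathy profile. The following is a minimal sketch only: the text does not name a hydropathy scale, so the widely used Kyte-Doolittle values are assumed here, and the window size, cutoff, function names and toy peptide are all hypothetical choices for illustration.

```python
# Kyte-Doolittle hydropathy values (assumed scale; the text does not name one).
KD = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def hydropathy_profile(seq, window=9):
    """Mean hydropathy of each full window; entry i covers seq[i:i + window]."""
    scores = [KD[aa] for aa in seq]
    return [sum(scores[i:i + window]) / window
            for i in range(len(seq) - window + 1)]


def hydrophilic_windows(seq, window=9, cutoff=-1.0):
    """1-based start positions of windows whose mean hydropathy is below cutoff."""
    profile = hydropathy_profile(seq, window)
    return [i + 1 for i, score in enumerate(profile) if score < cutoff]


if __name__ == "__main__":
    toy = "LLVVIIAALLRKDEDKRQEELLVVIIAAL"  # hypothetical peptide, not a NOV4 sequence
    print(hydrophilic_windows(toy))
```

Windows with a strongly negative mean are candidate surface-exposed stretches and therefore candidate immunogens; the epitope ranges quoted above would correspond to runs of such windows in the actual NOV4 profile.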

NOV5

NOV5 includes a family of two similar nucleic acids and two similar proteins disclosed
below. The disclosed nucleic acids encode serotonin receptor-like proteins. The Serotonin
Receptor-like gene disclosed in this invention maps to chromosome 2. This assignment was
made using mapping information associated with genomic clones, public genes and ESTs
sharing sequence identity with the disclosed sequence and CuraGen Corporation's Electronic
Northern bioinformatic tool.

NOV5a

The disclosed NOV5a nucleic acid was identified by TblastN using CuraGen
Corporation's sequence file for the 5-hydroxytryptamine receptor-like protein or homolog as
run against the Genomic Daily Files made available by GenBank or from files downloaded
from the individual sequencing centers. The nucleic acid sequence was predicted from the
genomic file Seq Ctr ACCNO: nh0028h22 by homology to a known 5-hydroxytryptamine
receptor or homolog. Exons were predicted by homology and the intron/exon boundaries were
determined using standard genetic rules. Exons were further selected and refined by means of
similarity determination using multiple BLAST (for example, tBlastN, BlastX, and BlastN)
searches, and, in some instances, GenScan and Grail. Expressed sequences from both public
and proprietary databases were also added when available to further define and complete the
gene sequence. The DNA sequence was then manually corrected for apparent inconsistencies,
thereby obtaining the sequences encoding the full-length protein.

The disclosed NOV5a nucleic acid of 1150 nucleotides (also referred to as
GM_83554525_A, or CG54692-01) is shown in Table 5A. An ORF begins with an ATG
initiation codon at nucleotides 24-26 and ends with a TGA codon at nucleotides 1134-1136. A
putative untranslated region upstream from the initiation codon and downstream from the
termination codon is underlined in Table 5A, and the start and stop codons are in bold letters.
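
The ORF coordinates given above fix the length of the encoded protein by simple arithmetic. The sketch below assumes 1-based, inclusive nucleotide coordinates, as used in the text; the function name is illustrative and not part of the disclosure.

```python
def orf_codon_count(start_first, stop_last):
    """Number of codons in an ORF given 1-based inclusive coordinates.

    start_first: position of the first base of the initiation codon.
    stop_last:   position of the last base of the termination codon.
    """
    length = stop_last - start_first + 1
    if length % 3 != 0:
        raise ValueError("ORF length is not a multiple of three")
    return length // 3


codons = orf_codon_count(24, 1136)  # ATG at 24-26, TGA at 1134-1136
residues = codons - 1               # the stop codon encodes no residue
print(codons, residues)             # 371 370
```

The 371 codons include the stop codon, which is consistent with a 370-residue protein, and leave 23 untranslated nucleotides upstream and 14 downstream in the 1150-nucleotide sequence.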

Table 5A. NOV5a Nucleotide Sequence (SEQ ID NO:9)



CTGGAGCTGCGATCCCAAGCGCGATGGAGGCCGCTAGCCTTTCAGTGGCCACCGCCGGCGTTGCCCTTG
CCCTGGGACCCGAGACCAGCAGCGGGACCCCAAGCCCGAGAGGGATACTCGGTTCGACCCCGAGCGGCG
CCGTCCTGCCGGGCGGAGGGCCGCCCTTCTCTGTCTTCACGGTCCTGGTGGTGACGCTGCTAGTGCTGC
TGATCGCTGCCACTTTCCTGTGGAACCTGCTGGTTCCGGTCACCATCCCGCGGGTCCGTGCCTTCCACC
GCGTGCCGCATAACTTGGTGGCCTCGACGGCCGTCTCGGACGAACTAGTGGCAGCGCTGGCGATGCCAC
CGAGCCTGGCGAGTGAGCTGTCGACCGGGCGACGTCGGCTGCTGGGCCGGAGCCTGTGCCACGTGTGGA
TCTCCTTCGACGCCCTGTGCTGCCCCGCCGGCCTCGGGAACGTGGCGGCCATCGCCCTGGGCCGCGACG
GGGCCATCACACGGCACCTGCAGCACACGCTGCGCACCCGCAGCCGCGCCTCGTTGCTCATGATCGCGC
TCGCCCGGGTGCCGTCGGCGCTCATCGCCCTCGCGCCGCTGCTCTTTGGCCGGGGCGAGGTGTGCGACG
CTCGGCTCCAGCGCTGCCAGGTGAGCCGGGAACCCTCCTATGCCGCCTTCTCCACCCGCGGCGCCTTCC
ACCTGCCGCTTGGCGTGGTGCCGTTTGTCTACCGGAAGATCTACGAGGCGGCCAAGTTTCGTTTCGGCC
GCCGCCGGAGAGCTGTGCTGCCGTTGCCGGCCACCTCCAAGGTAAAGGAAGCACCTGATGAGGCTGAAG
TGGTGTTCACGGCACATTGCAAAGCAACGGTGTCCTTCCAGGTGAGCGGGGAGTCCTGGCGGGAGCAGA
AGGAGAGGCGAGCAGCCATGATGGTGGGAATTCTGATTGGCGTGTTTGTGCTGTGCTGGATCCCCTTCT
TCCTGACGGAACTCATCAGCCCACTCTGTGCCTGCAGCCTGCCCCCCATCTGGAAAAGCATATTTCTGT
GGCTTGGCTACTCCAATTCTTTCTTCAACCCCCTGATTTACACAGCTTTTAACAAGAACTACAACAATG
CCTTCAAGAGCCTCTTTACTAAGCAGAGATGAACACAGGGGTTAGA



The NOV5a protein encoded by SEQ ID NO:9 has 370 amino acid residues and is
presented using the one-letter code in Table 5B. The Psort profile for NOV5a predicts that
this sequence has a signal peptide and is likely to be localized at the endoplasmic reticulum
membrane with a certainty of 0.6850; it may also localize to the plasma membrane (certainty
of 0.6400). The most likely cleavage site for a peptide is between amino acids 24 and 25,
at the slash in the amino acid sequence SSG-TP (shown as a slash in Table 5B) based on the
SignalP result.



Table 5B. Encoded NOV5a protein sequence (SEQ ID NO:10)

MEAASLSVATAGVALALGPETSSG/TPSPRGILGSTPSGAVLPGRGPPFSVFTVLVVTLLVLLIAATFLWNL
LVPVTIPRVRAFHRVPHNLVASTAVSDELVAALAMPPSLASELSTGRRRLLGRSLCHVWISFDALCCPAGLG
NVAAIALGRDGAITRHLQHTLRTRSRASLLMIALARVPSALIALAPLLFGRGEVCDARLQRCQVSREPSYAA
FSTRGAFHLPLGVVPFVYRKIYEAAKFRFGRRRRAVLPLPATSKVKEAPDEAEVVFTAHCKATVSFQVSGDS
WREQKERRAAMMVGILIGVFVLCWIPFFLTELISPLCACSLPPIWKSIFLWLGYSNSFFNPLIYTAFNKNYN
NAFKSLFTKQR



The disclosed nucleic acid sequence for NOV5a has 990 of 1230 bases (80%)
identical to a Mus musculus 5-hydroxytryptamine receptor mRNA (GENBANK-ID: X69867)
(E = 1.1e-167). Additionally, high homology with a portion of the protein of the invention is
found with two nucleic acid sequences: 335 of 336 bases (99%) are identical to a part of
a 2061 bp Homo sapiens 5-hydroxytryptamine receptor gene (GENBANK-ID:A39680,
Sequence 3 from Patent WO 94/18319, 6.4e-69), and 117 of 117 bases (100%) are identical
to a 371 bp Homo sapiens expressed sequence tag (EST) (GENBANK-ID:A39680:
Soares_testis_NHT Homo sapiens cDNA clone, IMAGE:1641069, E = 2.8e-20). This 95-100%
homology of the gene of the current invention with a public EST sequence strongly
suggests that the current invention represents an expressed gene.

The full NOV5a amino acid sequence of the protein of the invention was found to have
295 of 370 amino acid residues (79%) identical to, and 317 of 370 residues (85%) positive
with, the 370 amino acid residue 5-hydroxytryptamine receptor protein from Rattus
norvegicus (ptnr:SPTREMBL-ACC:P35365), and also 225 of 348 amino acid
residues (64%) identical to, and 261 of 348 residues (75%) positive with, the 357 amino acid
residue 5-hydroxytryptamine receptor protein from Homo sapiens (ptnr:SWISSPROT-ACC:P47898).
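
The identity and positive percentages quoted above follow directly from the match counts and aligned lengths. The sketch below recomputes them; rounding down to a whole percent is an assumption inferred from the quoted figures, not a rule stated in the text, and the function name and labels are illustrative.

```python
def truncated_percent(matches, aligned_length):
    """Integer percentage, rounded down (assumed convention)."""
    if aligned_length <= 0:
        raise ValueError("aligned_length must be positive")
    return (100 * matches) // aligned_length


# (matches, aligned length) pairs quoted in the text for NOV5a
reported = {
    "Mus musculus mRNA, nucleotide identity": (990, 1230),
    "Rattus norvegicus protein, identity": (295, 370),
    "Rattus norvegicus protein, positives": (317, 370),
    "Homo sapiens protein, identity": (225, 348),
    "Homo sapiens protein, positives": (261, 348),
}

for label, (m, n) in reported.items():
    print(f"{label}: {m}/{n} = {truncated_percent(m, n)}%")
```

Under this assumption the computed values are 80%, 79%, 85%, 64% and 75%, matching the figures given for NOV5a.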



NOV5b

NOV5a (GM_83554525_A) was subjected to an exon linking process to confirm the
sequence. PCR primers were designed by starting at the most upstream sequence available,
for the forward primer, and at the most downstream sequence available for the reverse primer.
In each case, the sequence was examined, walking inward from the respective termini toward
the coding sequence, until a suitable sequence that is either unique or highly selective was
encountered, or, in the case of the reverse primer, until the stop codon was reached. Such
suitable sequences were then employed as the forward and reverse primers in a PCR
amplification based on a wide range of cDNA libraries.

The cDNA coding for the NOV5b sequence was cloned by the polymerase chain
reaction (PCR) using the primers: 5' CATGGAGGCCGCTAGCCTTT 3' (SEQ ID NO:54) and
5' CCCTGTGTTCATCTCTGCTTAGTAAAGAG 3' (SEQ ID NO:55). Primers were
designed based on in silico predictions of the full length or some portion (one or more exons)
of the cDNA/protein sequence of the invention. These primers were used to amplify a cDNA
from a pool containing expressed human sequences derived from the following tissues:
adrenal gland, bone marrow, brain - amygdala, brain - cerebellum, brain - hippocampus, brain
- substantia nigra, brain - thalamus, brain - whole, fetal brain, fetal kidney, fetal liver, fetal
lung, heart, kidney, lymphoma - Raji, mammary gland, pancreas, pituitary gland, placenta,
prostate, salivary gland, skeletal muscle, small intestine, spinal cord, spleen, stomach, testis,
thyroid, trachea and uterus.
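
The relationship between the two primers and the coding strand can be checked with a reverse-complement operation. The following sketch uses only the two primer sequences quoted above; the helper name is illustrative.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Reverse complement of an uppercase DNA sequence."""
    return seq.translate(COMPLEMENT)[::-1]


forward = "CATGGAGGCCGCTAGCCTTT"            # SEQ ID NO:54
reverse = "CCCTGTGTTCATCTCTGCTTAGTAAAGAG"   # SEQ ID NO:55

# The reverse primer anneals to the coding strand, so its reverse complement
# reads in the same orientation as the disclosed nucleotide sequence.
sense = reverse_complement(reverse)
print(sense)             # CTCTTTACTAAGCAGAGATGAACACAGGG
print("ATG" in forward)  # True: the forward primer spans the initiation codon
print("TGA" in sense)    # True: the reverse primer spans the stop codon
```

Read on the coding strand, the reverse primer covers the end of the open reading frame including the TGA stop codon, in keeping with the design rule that the reverse primer is chosen by walking inward until the stop codon is reached.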

Multiple clones were sequenced and these fragments were assembled together,
sometimes including public human sequences, using bioinformatic programs to produce a
consensus sequence for each assembly. Each assembly is included in CuraGen Corporation's
database. Sequences were included as components for assembly when the extent of identity
with another component was at least 95% over 50 bp. Each assembly represents a gene or
portion thereof and includes information on variants, such as splice forms, single nucleotide
polymorphisms (SNPs), insertions, deletions and other sequence variations.
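
The inclusion criterion of at least 95% identity over 50 bp can be sketched as a sliding-window test. This is a simplified illustration, not the assembly program referred to in the text: it assumes the two components have already been placed in register and compares them without gaps, and the function name and parameters are hypothetical.

```python
def meets_inclusion_criterion(a, b, window=50, min_identity=0.95):
    """True if some ungapped stretch of `window` aligned positions has
    identity >= min_identity. Sequences are compared position by position."""
    n = min(len(a), len(b))
    if n < window:
        return False
    matches = [a[i] == b[i] for i in range(n)]
    needed = min_identity * window
    running = sum(matches[:window])       # matches in the first window
    if running >= needed:
        return True
    for i in range(window, n):            # slide the window one base at a time
        running += matches[i] - matches[i - window]
        if running >= needed:
            return True
    return False
```

With the default parameters a window passes when at least 48 of its 50 positions match. A real assembler would align the components first, so that insertions and deletions do not shift the register of the comparison.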

Variant sequences are also included in this application. A variant sequence can include
a single nucleotide polymorphism (SNP). A SNP can, in some instances, be referred to as a
"cSNP" to denote that the nucleotide sequence containing the SNP originates as a cDNA. A
SNP can arise in several ways. For example, a SNP may be due to a substitution of one
nucleotide for another at the polymorphic site. Such a substitution can be either a transition or
a transversion. A SNP can also arise from a deletion of a nucleotide or an insertion of a
nucleotide, relative to a reference allele. In this case, the polymorphic site is a site at which
one allele bears a gap with respect to a particular nucleotide in another allele. SNPs occurring
within genes may result in an alteration of the amino acid encoded by the gene at the position
of the SNP. Intragenic SNPs may also be silent, when a codon including a SNP encodes the
same amino acid as a result of the redundancy of the genetic code. SNPs occurring outside the
region of a gene, or in an intron within a gene, do not result in changes in any amino acid
sequence of a protein but may result in altered regulation of the expression pattern. Examples
include alteration in temporal expression, physiological response regulation, cell type
expression regulation, intensity of expression, and stability of transcribed message.
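
The distinctions drawn above (transition versus transversion, and silent versus amino acid-altering substitutions) can be made concrete with a short sketch. It assumes the standard genetic code and a single-base substitution within one codon; the function names are illustrative.

```python
from itertools import product

# Standard genetic code, codons enumerated in TCAG order (TTT, TTC, TTA, ...).
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}

PURINES = {"A", "G"}


def substitution_type(ref, alt):
    """Transition (purine<->purine or pyrimidine<->pyrimidine) or transversion."""
    if ref == alt:
        raise ValueError("ref and alt must differ")
    same_class = (ref in PURINES) == (alt in PURINES)
    return "transition" if same_class else "transversion"


def coding_effect(codon, offset, alt):
    """Effect of replacing the base at 0-based `offset` in `codon` with `alt`."""
    mutated = codon[:offset] + alt + codon[offset + 1:]
    before, after = CODON_TABLE[codon], CODON_TABLE[mutated]
    return "silent" if before == after else f"{before}->{after}"


print(substitution_type("T", "C"), coding_effect("GTG", 1, "C"))  # transition V->A
print(substitution_type("G", "A"), coding_effect("CTG", 2, "A"))  # transition silent
```

The first example, a T-to-C change at the second position of a GTG codon, alters valine to alanine; the second, a third-position change in a leucine codon, is silent because of the redundancy of the genetic code.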

SeqCalling assemblies produced by the exon linking process were selected and
extended using the following criteria. Genomic clones having regions with 98% identity to all
or part of the initial or extended sequence were identified by BLASTN searches using the
relevant sequence to query human genomic databases. The genomic clones that resulted were
selected for further analysis because this identity indicates that these clones contain the
genomic locus for these SeqCalling assemblies. These sequences were analyzed for putative
coding regions as well as for similarity to the known DNA and protein sequences. Programs
used for these analyses include Grail, Genscan, BLAST, HMMER, FASTA, Hybrid and other
relevant programs.

Some additional genomic regions may have also been identified because selected
SeqCalling assemblies map to those regions. Such SeqCalling sequences may have
overlapped with regions defined by homology or exon prediction. They may also be included
because the location of the fragment was in the vicinity of genomic regions identified by
similarity or exon prediction that had been included in the original predicted sequence. The
sequence so identified was manually assembled and then may have been extended using one
or more additional sequences taken from CuraGen Corporation's human SeqCalling database.
SeqCalling fragments suitable for inclusion were identified by the CuraTools™ program
SeqExtend or by identifying SeqCalling fragments mapping to the appropriate regions of the
genomic clones analyzed. Such sequences were included in the derivation of NOV5b (Acc.
No. CG54692-02) only when the extent of identity in the overlap region with one or more
SeqCalling assemblies 145286067 was high. The extent of identity may be, for example,
about 90% or higher, preferably about 95% or higher, and even more preferably close to or
equal to 100%. When necessary, the process to identify and analyze SeqCalling fragments
and genomic clones was reiterated to derive the full length sequence.

The regions defined by the procedures described above were then manually integrated
and corrected for apparent inconsistencies that may have arisen, for example, from miscalled
bases in the original fragments or from discrepancies between predicted exon junctions, EST
locations and regions of sequence similarity, to derive the final sequence disclosed herein.
When necessary, the process to identify and analyze SeqCalling assemblies and genomic
clones was reiterated to derive the full length sequence. The following public components
were thus included in the invention: gb:GENBANK-ID:AC009404|acc:AC009404.5 Homo
sapiens BAC clone RP11-28H22 from 2, complete sequence - Homo sapiens, 112883 bp. In
addition, the following CuraGen Corporation SeqCalling Assembly ID's were also included in
the invention: 145286067.

The resulting amplicon was gel purified, cloned and sequenced to high redundancy to
provide NOV5b (SEQ ID NO:11), which is also referred to as CuraGen Acc. No. CG54692-02.

The nucleotide sequence for NOV5b (1150 bp, SEQ ID NO:11) is presented in Table
5C. An open reading frame was identified beginning at nucleotides 24-26 and ending at
nucleotides 1134-1136. The start and stop codons of the open reading frame are highlighted in
bold type, and putative untranslated regions are underlined. The nucleotide sequence of
NOV5b differs from NOV5a by six nucleotide changes: T709>C; T795>A; C796>T;
C797>G; A798>C; G800>A.



Table 5C. NOV5b Nucleotide Sequence (SEQ ID NO:11)

CTGGAGCTGCGATCCCAAGCGCCATGGAGGCCGCTAGCCTTTCAGTGGCCACCGCCGGCG 60
TTGCCCTTGCCCTGGGACCCGAGACCAGCAGCGGGACCCCAAGCCCGAGAGGGATACTCG 120
GTTCGACCCCGAGCGGCGCCGTCCTGCCGGGCCGAGGGCCGCCCTTCTCTGTCTTCACGG 180
TCCTGGTGGTGACGCTGCTAGTGCTGCTGATCGCTGGCACTTTCCTGTGGAACCTGCTGG 240
TTCCGGTCACCATCCCGCGGGTCCGTGCCTTCCACCGCGTGCCGCATAACTTGGTGGCCT 300
CGACGGCCGTCTCGGACGAACTAGTGGCAGCGCTGGCGATGCCACCGAGCCTGGCGAGTG 360
AGCTGTCGACCGGGCGACGTCGGCTGCTGGGCCGGAGCCTGTGCCACGTGTGGATCTCCT 420
TCGACGCCCTGTGCTGCCCCGCCGGCCTCGGGAACGTGGCGGCCATCGCCCTGGGCCGCG 480
ACGGGGCCATCACACGGCACCTGCAGCACACGCTGCGCACCCGCAGCCGCGCCTCGTTGC 540
TCATGATCGCGCTCGCCCGGGTGCCGTCGGCGCTCATCGCCCTCGCGCCGCTGCTCTTTG 600
GCCGGGGCGAGGTGTGCGACGCTCGGCTCCAGCGCTGCCAGGTGAGCCGGGAACCCTCCT 660
ATGCCGCCTTCTCCACCCGCGGCGCCTTCGACCTGCCGCTTGGCGTGGCGCCGTTTGTCT 720
ACCGGAAGATCTACGAGGCGGCCAAGTTTCGTTTCGGCCGCCGCCGGAGAGCTGTGCTGC 780
CGTTGCCGGCCACCATGCAAGTAAAGGAAGCACCTGATGAGGCTGAAGTGGTGTTCACGG 840
CACATTGCAAAGCAACGGTGTCCTTCCAGGTGAGCGGGGACTCCTGGCGGGAGCAGAAGG 900
AGAGGCGAGCAGCCATGATGGTGGGAATTCTGATTGGCGTGTTTGTGCTGTGGTGGATCC 960
CCTTCTTCCTGACGGAACTCATCAGCCCACTCTGTGCCTGCAGCCTGCCCCCCATCTGGA 1020
AAAGCATATTTCTGTGGCTTGGCTACTCCAATTCTTTCTTCAACCCCCTGATTTACACAG 1080
CTTTTAACAAGAACTACAACAATGCCTTCAAGAGCCTCTTTACTAAGCAGAGATGAACAC 1140
AGGGGTTAGA 1150



In a search of sequence databases, it was found, for example, that the NOV5b nucleic
acid sequence has 920 of 1123 bases (81%) identical to a serotonin receptor mRNA from Mus
musculus (gb:GENBANK-ID:MM5HT5BSR|acc:X69867.1, M. musculus mRNA encoding 5-HT5B
serotonin receptor, E = 1.9e-163).

The encoded NOV5b protein is presented in Table 5D. The disclosed protein is 370
amino acids long and is denoted by SEQ ID NO:12. NOV5b differs from NOV5a by 3 amino
acid residues: V229>A; S258>M; K259>Q.
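
The six nucleotide changes reported between NOV5a and NOV5b can be mapped onto residue numbers using the position of the initiation codon. The sketch below assumes 1-based nucleotide positions and an ORF starting at nucleotide 24, as stated for both sequences; the function name is illustrative.

```python
ORF_START = 24  # first base of the ATG initiation codon


def codon_number(nt_pos, orf_start=ORF_START):
    """1-based codon (residue) number containing 1-based nucleotide nt_pos."""
    if nt_pos < orf_start:
        raise ValueError("position lies upstream of the ORF")
    return (nt_pos - orf_start) // 3 + 1


# Nucleotide changes reported between NOV5a and NOV5b
changes = {709: "T>C", 795: "T>A", 796: "C>T", 797: "C>G", 798: "A>C", 800: "G>A"}

by_codon = {}
for pos, change in changes.items():
    by_codon.setdefault(codon_number(pos), []).append(f"{change} at {pos}")

for codon, hits in sorted(by_codon.items()):
    print(codon, hits)
```

The six changes fall in codons 229, 258 and 259 only, which agrees with the three residue differences V229>A, S258>M and K259>Q.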

Like NOV5a, the Psort profile for NOV5b predicts that this sequence has a signal
peptide and is likely to be localized at the endoplasmic reticulum membrane with a certainty of
0.6850, or at the plasma membrane, with a certainty of 0.6400. The most likely cleavage site
for a peptide is between amino acids 24 and 25, i.e., at the slash in the amino acid sequence
SSG-TP (shown as a slash in Table 5D) based on the SignalP result.



Table 5D. Encoded NOV5b protein sequence (SEQ ID NO:12)

MEAASLSVATAGVALALGPETSSG/TPSPRGILGSTPSGAVLPGRGPPFSVFTVLVVTLLV 60
LLIAATFLWNLLVPVTIPRVRAFHRVPHNLVASTAVSDELVAALAMPPSLASELSTGRRR 120
LLGRSLCHVWISFDALCCPAGLGNVAAIALGRDGAITRHLQHTLRTRSRASLLMIALARV 180
PSALIALAPLLFGRGEVCDARLQRCQVSREPSYAAFSTRGAFHLPLGVAPFVYRKIYEAA 240
KFRFGRRRRAVLPLPATMQVKEAPDEAEVVFTAHCKATVSFQVSGDSWREQKERRAAMMV 300
GILIGVFVLCWIPFFLTELISPLCACSLPPIWKSIFLWLGYSNSFFNPLIYTAFNKNYNN 360
AFKSLFTKQR 370

The full amino acid sequence of the NOV5b protein was found to have 295 of 370
amino acid residues (79%) identical to, and 315 of 370 amino acid residues (85%) similar to,
the 370 amino acid residue serotonin receptor protein from Rattus norvegicus
(ptnr:SWISSPROT-ACC:P35365, 5-HYDROXYTRYPTAMINE 5B RECEPTOR (5-HT-5B),
SEROTONIN RECEPTOR (MR22), E = 6.8e-152).

Patp results include those listed in Table 5E.



Table 5E. Patp alignments of NOV5a

                                                                        Smallest
                                                                        Sum
                                                       Reading  High    Prob
Sequences producing High-scoring Segment Pairs:        Frame    Score   P(N)

Patp:R58686 Rat MR22 serotonin receptor protein - ...    +3     1486    1.6e-151
Patp:R57066 Murine serotoninergic receptor 5HT5b - ...   +3     1485    2.0e-151
Patp:R45848 Human 5HT5a serotonin receptor - ...         +3     1046    6.7e-105
Patp:R45847 Murine 5HT5a serotonin receptor - ...        +3     1041    2.3e-104
Patp:R58685 Rat REC17 serotonin receptor protein - ...   +3     1036    4.7e-104
Patp:R57067 Human serotoninergic receptor 5HT5b - ...    +3      596    3.2e-57



For example, a BLAST against R58686, a 370 amino acid serotonin receptor from
Rattus rattus, produced 295/370 (79%) identity, and 317/370 (85%) positives (E = 1.6e-151),
with long segments of amino acid identity, as shown in Table 5F. WO 94/21670. A BLAST
against R57066, a 370 amino acid murine serotoninergic receptor (5HT5b) from Mus
musculus, produced 297/370 (80%) identity, and 318/370 (85%) positives (E = 2.0e-151). WO
94/18319. Additionally, amino acids 260-320 from NOV5 were found to be identical with a
111 amino acid human serotonergic receptor (E = 3.2e-57). WO 94/18319.

Unless specifically addressed as NOV5a or NOV5b, any reference to NOV5 is assumed
to encompass all variants. Residue differences between any NOVX variant sequences herein
are written to show the residue in the "a" variant and the residue position with respect to the
"b" variant. NOV residues in all following sequence alignments that differ between the
individual NOV variants are highlighted with a box and marked with the (o) symbol above the
variant residue in all alignments herein. For example, the protein shown in line 1 of Table 5F
depicts the sequence for NOV5a, and the positions where NOV5b differs are marked with a
(o) symbol and are highlighted with a box. Both NOV5 proteins have significant homology to
serotonin receptor (SR) proteins:



Table 5F. NOV5 alignment with R58686 (SEQ ID NO:56) 



Score = 1466 (523.1 bits), Expect = 1.6e-151, P = 1.6e-151
Identities = 295/370 (79%), Positives = 317/370 (85%), Frame = +3

[Pairwise alignment of NOV5 (residues 1-370) with R58686 (residues 1-370), shown in rows of 60 residues.]
The disclosed NOV5 protein has good identity with a number of serotonin receptor
proteins. The identity information used for ClustalW analysis is presented in Table 5G.



Table 5G. BLAST results for NOV5

Gene Index                     Protein/Organism                               Length  Identity       Positives      Expect
                                                                              (aa)    (%)            (%)

gi|543730|sp|P35365|5H5B_RAT   5-HYDROXYTRYPTAMINE 5B RECEPTOR (5-HT-5B)      370     265/370 (71%)  287/370 (76%)  e-134
                               (SEROTONIN RECEPTOR) (MR22)
                               Rattus norvegicus

gi|6754260|ref|NP_034613.1|    5-hydroxytryptamine (serotonin) receptor 5B    370     267/370 (72%)  299/370 (77%)  e-134
                               Mus musculus

gi|543453|pir||S38744          serotonin receptor 5B                          369     265/370 (71%)  287/370 (76%)  e-133
                               Rattus norvegicus

gi|13236497|ref|NP_076917.1|   5-hydroxytryptamine (serotonin) receptor 5A    357     204/349 (59%)  237/349 (67%)  2e-97
                               Homo sapiens


This information is presented graphically in the multiple sequence alignment given in
Table 5H (with NOV5a being shown on line 1) as a ClustalW analysis comparing NOV5 with
related protein sequences.



Table 5H. Information for the ClustalW proteins:

1) NOV5a (SEQ ID NO:10)
2) NOV5b (SEQ ID NO:12)
3) gi|310075|gb|AAA40616.1| (L10073) serotonin receptor [Rattus norvegicus] (SEQ ID NO:57)
4) gi|6754260|ref|NP_034613.1| 5-hydroxytryptamine (serotonin) receptor 5B [Mus musculus] (SEQ ID NO:58)
5) gi|543453|pir||S38744 serotonin receptor 5B - rat (SEQ ID NO:59)
6) gi|13236497|ref|NP_076917.1| 5-hydroxytryptamine (serotonin) receptor 5A [H. sapiens] (SEQ ID NO:60)

[ClustalW multiple sequence alignment of the six sequences listed above, alignment positions 1-370.]
DOMAIN results for NOV5a were collected from the Conserved Domain Database
(CDD) with Reverse Position Specific BLAST. This BLAST samples domains found in the
Smart and Pfam collections. The results for NOV5a are listed in Table 5I with the statistics
and domain description.

The region from amino acid residue 70 through 351 (SEQ ID NO:10) most probably (E
= 6e-28) contains a "seven transmembrane receptor (rhodopsin family) fragment" domain,
aligned here with residues 2-254 of the 7tm_1 entry (SEQ ID NO:61, see Table 5J, below) of
the Pfam database. This indicates that the GPCR5 sequence has properties similar to those of
other proteins known to contain this domain as well as to the 377 amino acid 7tm domain
itself. GPCR5b also has identity to the TM7 domain. The regions from amino acid residue 70
through 235 and from 284-351 (of SEQ ID NO:12) align with amino acid residues of TM7 (E
= 2.8e-^^).
Table 5I. DOMAIN results for NOV5

gnl|Pfam|pfam00001, 7tm_1, 7 transmembrane receptor (rhodopsin family)
(SEQ ID NO:61) Length = 377

Statistics for NOV5: Score = 118 bits (295), Expect = 6e-28

[Alignment of NOV5 with gnl|Pfam|pfam00001.]

The representative member of the 7 transmembrane receptor family is the D2
dopamine receptor from Bos taurus (SWISSPROT: locus D2DR_BOVIN, accession P20288;
gene index 118205). The D2 receptor is an integral membrane protein and belongs to Family
1 of G-protein coupled receptors. The activity of the D2 receptor is mediated by G proteins
which inhibit adenylyl cyclase. Chio et al., Nature 343:255-269 (1990). Residues 51-427 of
this 444 amino acid protein are considered to be the representative TM7 domain, shown in
Table 5J.



Table 5J. Amino acid sequence for TM7 (SEQ ID NO:61)

VTLDVMMCTASILNLCAISXDRYTAVAMPMLYNTRYSSKRRV^ 

LFGLNNTDQNECIIANPAFVVYSSIVSFyVPFIVTLLVYlKIYIVLRRRRKRVNTKRSSR 
AFRANLKAPMGNCTHPEDMKLCTVIMKSNGSFPVNRERVEAARRAQELEMEMLSSTSPP 
BRTRYSPIPPSHHQLTLPDPSHHGLHSTPDSPAKPEKNGHAKTVNPKIAKIFEIQSMPNG 
KTRTSLKTMSRRKLSQQKEKKATQ^^LAIVLGVFIICWLPFFITHIIlNIHCDCNIPPVLYS 
AFTWLGYVKSAVNPI I Y 

The 7 transmembrane receptor family includes a number of different proteins,
including, for example, hormone, neurotransmitter and light receptors, all of which transduce
extracellular signals through interaction with guanine nucleotide-binding proteins. Although
the activating ligands for this class of proteins vary widely in structure and character, the
amino acid sequences for the receptors are very similar and are believed to adopt a common
structural framework comprising seven transmembrane helices. Included in this family are
serotonin receptors, dopamine receptors, histamine receptors, adrenergic receptors,
cannabinoid receptors, angiotensin II receptors, chemokine receptors, opioid receptors,
G-protein coupled receptor (GPCR) proteins, olfactory receptors (OR), and the like. Some
proteins and the Protein Data Base Ids/gene indexes include, for example: rhodopsin (129209);
5-hydroxytryptamine receptors (112821, 8488960, 112805, 231454, 1168221, 398971,
112806); G protein-coupled receptors (119130, 543823, 1730143, 132206, 137159, 6136153,
416926, 1169881, 136882, 134079); gustatory receptors (544463, 462208); c-x-c chemokine
receptors (416718, 128999, 416802, 548703, 1352335); opsins (129193, 129197, 129203);
and olfactory receptor-like proteins (129091, 1171893, 400672, 548417).

Based on sequence homology with other serotonin receptors, as well as domain
information, the disclosed NOV5 proteins likely function as serotonin (5-hydroxytryptamine)
receptors. The neurotransmitter serotonin (5-hydroxytryptamine; 5-HT) exerts a wide variety
of physiologic functions through a multiplicity of receptors and may be involved in human
neuropsychiatric disorders such as anxiety, depression, or migraine. These receptors consist of
4 main groups, 5-HT-1, 5-HT-2, 5-HT-3, and 5-HT-4, subdivided into several distinct subtypes
on the basis of their pharmacologic characteristics, coupling to intracellular second
messengers, and distribution within the nervous system. Zifa and Fillion, Pharm. Rev. 44:401-458,
1992. The serotonergic receptors belong to the 5-hydroxytryptamine receptor
family of receptors coupled to guanine nucleotide-binding proteins. See, generally, OMIM
ID: 182131 and Demchyshyn, et al., Proc. Natl. Acad. Sci. 89:5522-5526, 1992.

Potential transmembrane regions of NOV5 include amino acids 48-64 (likelihood -12.10),
135-151 (likelihood -0.48), 172-188 (likelihood -4.94), and 300-316 (likelihood -9.66).

The nucleic acids and proteins of NOV5 are useful in potential therapeutic applications
implicated in various pathological disorders, described further below. For example, a cDNA
encoding the serotonin receptor-like protein may be useful in gene therapy, and the serotonin
receptor-like protein may be useful when administered to a subject in need thereof.

The nucleic acids and proteins of the invention have applications in the diagnosis
and/or treatment of various diseases and disorders. For example, the compositions of the
present invention will have efficacy for the treatment of patients suffering from: seizures,
Alzheimer's disease, sleep disorders, appetite disorders, thermoregulation, pain perception,
hormone secretion and sexual behavior, mental depression, migraine, epilepsy,
obsessive-compulsive behavior (schizophrenia), and affective disorders as well as other diseases,
disorders and conditions.
The polypeptides can be used as immunogens to produce antibodies specific for the
invention, and as vaccines. They can also be used to screen for potential agonist and
antagonist compounds. For example, a cDNA encoding the serotonin receptor-like protein
may be useful in gene therapy, and the receptor-like protein may be useful when administered
to a subject in need thereof. The novel nucleic acid encoding serotonin receptor-like protein,
and the serotonin receptor-like protein of the invention, or fragments thereof, may further be
useful in diagnostic applications, wherein the presence or amount of the nucleic acid or the
protein are to be assessed. These materials are further useful in the generation of antibodies
that bind immunospecifically to the novel substances of the invention for use in therapeutic or
diagnostic methods. These antibodies may be generated according to methods known in the
art, using prediction from hydrophobicity charts, as described in the "Anti-NOVX Antibodies"
section below. For example, the disclosed NOV5 protein has multiple hydrophilic regions,
each of which can be used as an immunogen. In one embodiment, a contemplated NOV5
epitope is from about amino acids 10 to 40. In another embodiment, a NOV5 epitope is from
about amino acids 110 to 130. In additional embodiments, NOV5 epitopes are from amino
acids 150 to 175, 190 to 200, 240 to 270 and from amino acids 280 to 320. This novel protein
also has value in the development of a powerful assay system for functional analysis of various
human disorders, which will help in understanding the pathology of the disease and in the
development of new drug targets for various disorders.

NOV6

NOV6a 

NOV6a was initially identified by searching CuraGen's Human SeqCalling database
for DNA sequences that translate into proteins with similarity to a protein family of interest.
SeqCalling assembly 21639300 was identified as having suitable similarity. SeqCalling
assembly 21639300 was analyzed further to identify an open reading frame encoding for a
novel full length protein and novel splice forms of this gene. This was done by extending the
SeqCalling assembly using suitable additional SeqCalling assemblies, publicly available EST
sequences and public genomic sequence. Public ESTs and additional CuraGen SeqCalling
assemblies were identified by the CuraTools program SeqExtend. They were included in the
DNA sequence extension for SeqCalling assembly 21639300 only when sufficient identical
overlap was found. These inclusions are described below.

The genomic clone AL121901 was identified as having regions with 100% identity to
the SeqCalling assembly 21639300 and was selected for analysis because this identity
implied that the clone AL121901 contained the sequence of the genomic locus for SeqCalling
assembly 21639300.

The genomic clone AL121901 was analyzed by Genscan and Grail to identify exons
and putative coding sequences/open reading frames. The clone AL121901 was also analyzed
by publicly available TblastN, BlastX, and other homology programs to identify regions
translating to proteins with similarity to the original protein/protein family of interest.

The results of these analyses were integrated and manually corrected for apparent
inconsistencies, thereby obtaining the sequence encoding the full-length protein. When
necessary, the process to identify and analyze cDNAs/ESTs and genomic clones was reiterated
to derive the full-length sequence. This invention describes this full-length DNA sequence(s)
and the full-length protein sequence(s) which they encode. This gene belongs to genomic
clone AL121901 on Chromosome 20.

The disclosed novel NOV6a nucleic acid of 963 nucleotides (Accession Number
21639300_EXT, SEQ ID NO:13) is shown in Table 6A. An open reading frame begins with an
ATG initiation codon at nucleotides 1-3 and ends with a TAA codon at nucleotides 961-963.
A putative untranslated region upstream from the initiation codon and downstream from the
termination codon are underlined in Table 6A, and the start and stop codons are in bold letters.



Table 6A. NOV6a Nucleotide Sequence (SEQ ID NO:13) 

ATOGCCGGCCCGTQGACCTTCACCCTTCTCTGTGGTTTGCTGGCAGCCACCTTGATCCAAGCCAGCCTCA 
GTCCCACTGCAGTTCTCATCCTCGGCCCAAAAGTCATCAAAGAAAGTCTGACACAGGAGCTGAAGGACCA 
CAACGCCACCAGCATCCTGCAGCAGCTGCCGCTGCTCAGTGCCATGCGGGAAAAGCCAGCCGG^ 
CCTGTGCTGGGCAGCCTGGTGAACACCGTCCTGAAGCACATCACCCCATCCAGGCTGAAGGTCaTCACAG 
CTAACATCCTCCAGCTGCAGGTGAAGCCCTCGGCCAATGACCAGGAGCTGCTAGTCAAGATCCCCCTGGA 
CATGGTGGCTGGATTCAACACGCCCCTGGTCAAGACCATCGTGGAG*rtCCACATGACGACTGAGGCCC^ 
GGCACCATCGGCATGGACACCAGTGCAAGTGGCGCCACCCGCCTGGTCCTCAGTGACTGTGCCACCAGCC 
ATGGGAGCCTGCGCATCCAACTGCTGCATAAGCTCTCCTTCCTGGTGAACGCCTTAGCTAAGCAGGTCAT 
GAACCTCCTAGTGCCATCCCTGCCCAATCTAGTGAAAAACCAGCTGTGTCCCGTGATCGAGGCTTCCTTC 
AATGGCATGTATGCAGACCTCCTGCAGCTGGTGAAGGGTAGGTGCTCTGCTCTCTCTCCCACTTTTTCCT 
TTACTACGGAGCTGGCCTCCAGACCCGGAAAGGTGACCAAGTGGTTCAATARGTCTGCAGCTTCCCTGAC 
AATGCCCACCCTGGACAAGATCGCGTTCAGCCTCATGGTGAGTCAGGAGGTGGTGAAAGCTGCAGTGGCT 
GCTGTGCTCTCTCCAGAfitGAATTCATGGTCCTGTTGGACTC3*GTGGTAAACCTCAGCACAAGGCAGAGAA 
TAGGGCCGCCCAGGCCACATCATAGGAATTTCCTGAACACAGGGTGCCCCTAA 
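
The open reading frame coordinates quoted for NOV6a (ATG at nucleotides 1-3, TAA at nucleotides 961-963) can be cross-checked against the stated protein length with a few lines of Python. The sketch below is illustrative only: the function names are hypothetical, the short test sequence is a made-up toy, and the check assumes the standard stop codons TAA, TAG and TGA.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}  # assumption: standard genetic code


def orf_protein_length(start_pos: int, stop_pos: int) -> int:
    """Number of residues encoded by an ORF.

    start_pos and stop_pos are the 1-based positions of the first nucleotide
    of the initiation codon and of the termination codon.
    """
    span = stop_pos - start_pos
    if span <= 0 or span % 3 != 0:
        raise ValueError("start and stop codons are not in the same frame")
    return span // 3


def check_orf(seq: str, start_pos: int, stop_pos: int) -> bool:
    """True if seq has ATG at start_pos, a stop codon at stop_pos,
    and no in-frame stop codon between them (positions are 1-based)."""
    orf_protein_length(start_pos, stop_pos)  # raises if out of frame
    seq = seq.upper()
    first = seq[start_pos - 1:start_pos + 2]
    last = seq[stop_pos - 1:stop_pos + 2]
    if first != "ATG" or last not in STOP_CODONS:
        return False
    # Codons strictly between the initiation and termination codons.
    inner = [seq[i:i + 3] for i in range(start_pos + 2, stop_pos - 1, 3)]
    return all(codon not in STOP_CODONS for codon in inner)


if __name__ == "__main__":
    # Coordinates quoted in the text for NOV6a.
    print(orf_protein_length(1, 961))
    # Toy sequence (not from the disclosure): ATG GCC GGC TAA.
    print(check_orf("ATGGCCGGCTAA", 1, 10))
```

With the quoted coordinates, the first call gives 320 codons ahead of the stop codon, which agrees with the 320 amino acid residues stated for the NOV6a protein.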

The disclosed nucleic acid sequence has 506 of 660 nucleotides (76%) identical to a
1683 bp Mus musculus von Ebner minor salivary gland protein
(GENBANK-ID:MMU46068|acc:U46068) (E value = 4.0e-[exponent illegible]).
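
The identity figures quoted in this section are simple ratios of matching positions to aligned positions. As a minimal sketch, the helper below reproduces the quoted percentage from the two counts. It assumes the percentage is truncated to a whole number rather than rounded, which is inferred from the quoted figures (506/660 is about 76.7%, reported as 76%) and is not stated in the text; the function names are hypothetical.

```python
def percent_identity(matches: int, aligned_length: int) -> int:
    """Whole-number percent identity, truncated toward zero (assumed convention)."""
    if aligned_length <= 0:
        raise ValueError("aligned_length must be positive")
    if not 0 <= matches <= aligned_length:
        raise ValueError("matches must lie between 0 and aligned_length")
    return (100 * matches) // aligned_length


def count_identities(a: str, b: str, gap: str = "-") -> int:
    """Count identical, non-gap columns of two pre-aligned sequences."""
    if len(a) != len(b):
        raise ValueError("aligned sequences must have equal length")
    return sum(1 for x, y in zip(a.upper(), b.upper()) if x == y and x != gap)


if __name__ == "__main__":
    # Counts quoted in the text for the NOV6a nucleotide comparison.
    print(percent_identity(506, 660))
```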

The NOV6a protein encoded by SEQ ID NO:13 has 320 amino acid residues, and is
presented using the one-letter code in Table 6B (SEQ ID NO:14). The SignalP, Psort and/or
Hydropathy profile for NOV6a predict that NOV6a is likely to be localized at the lysosome
lumen with a certainty of 0.8279, or extracellularly, with a certainty of 0.6138. A
cleavage site is indicated at the slash in the sequence TLS-PT, between amino acids 24 and 25
in Table 6B. The hydropathy profile of the NOV6a salivary gland protein-like protein
indicates that this sequence has a strong signal peptide toward the 5' terminal supporting
extracellular localization. It is very likely that the membrane-bound peptide as predicted here
is similar to the salivary gland protein gene family, some members of which are localized at
the plasma membrane. Therefore it is likely that this novel gene is available at the appropriate
sub-cellular localization and hence accessible for the therapeutic uses described in this
application.



Table 6B. Encoded NOV6a protein sequence (SEQ ID NO:14).



MAGEWTFTLLCGLIJ^TLIQATLS/PTAVLILGPKVIKESLTQELKDHNATSILQQLPLLSAMR^ 
FVXiGSLVNTVLKHlTPSIlLKVITANILQLQVKPSAKDQEIjLVKIPLDMVAGBTJTPL\^ 
ATIEmDTSASGPTRLVLSDCATSHGSLRIQLLHKLSFLVNALAKQVMtOiLVPSLPNLVK^ 
NGMYADliLQLVKGRCSALSPTFSrrTELASRPGKVTKWEmSAASLTMPTLpNIPFSLIVSQDVVKAAVA 
AVLgPEEFMVIiLDSVVNLSTRQHIGPPRPHHRNFLNTGCP 'l_ 



The full amino acid sequence of the disclosed NOV6a protein was found to have 164
of 302 amino acid residues (54%) identical to, and 208 of 302 amino acid residues (68%)
positive with, the 310 amino acid residue protein von Ebner minor salivary gland protein from
Mus musculus (SPTREMBL-ACC:Q61114) (E value = 3.4e-72).



NOV6b 

The NOV6a target sequence identified previously was subjected to the exon linking
process to confirm the sequence. PCR primers were designed by starting at the most upstream
sequence available, for the forward primer, and at the most downstream sequence available for
the reverse primer. In each case, the sequence was examined, walking inward from the
respective termini toward the coding sequence, until a suitable sequence that is either unique
or highly selective was encountered, or, in the case of the reverse primer, until the stop codon
was reached.

The cDNA coding for the NOV6b (CG51622-02) sequence was cloned by the
polymerase chain reaction (PCR) using the primers: 5'
CCAGCGCCGAATCTTGTGTTGAGT 3' (SEQ ID NO:62) and 5'
AGAGCGTTGGGTGACGTGAGGACT 3' (SEQ ID NO:63). Primers were designed based
on in silico predictions of the full length or some portion (one or more exons) of the
cDNA/protein sequence of the invention. These primers were used to amplify a cDNA from a
pool containing expressed human sequences derived from the following tissues: adrenal gland,
bone marrow, brain - amygdala, brain - cerebellum, brain - hippocampus, brain - substantia
nigra, brain - thalamus, brain - whole, fetal brain, fetal kidney, fetal liver, fetal lung, heart,
kidney, lymphoma - Raji, mammary gland, pancreas, pituitary gland, placenta, prostate,
salivary gland, skeletal muscle, small intestine, spinal cord, spleen, stomach, testis, thyroid,
trachea and uterus.
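
Basic properties of the two cloning primers (SEQ ID NO:62 and SEQ ID NO:63) can be tabulated with a short script. The sketch below is not part of the disclosed method: it only computes length, G+C content and the reverse complement of each oligonucleotide as transcribed above, and the helper names are hypothetical.

```python
# Map each base to its Watson-Crick complement.
COMPLEMENT = str.maketrans("ACGT", "TGCA")

PRIMERS = {
    "SEQ ID NO:62": "CCAGCGCCGAATCTTGTGTTGAGT",
    "SEQ ID NO:63": "AGAGCGTTGGGTGACGTGAGGACT",
}


def reverse_complement(seq: str) -> str:
    """Reverse complement of a DNA sequence written 5' to 3'."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def gc_count(seq: str) -> int:
    """Number of G and C bases in the sequence."""
    seq = seq.upper()
    return seq.count("G") + seq.count("C")


def gc_fraction(seq: str) -> float:
    """Fraction of bases that are G or C."""
    if not seq:
        raise ValueError("empty sequence")
    return gc_count(seq) / len(seq)


if __name__ == "__main__":
    for name, seq in PRIMERS.items():
        print(name, len(seq), f"{gc_fraction(seq):.2f}", reverse_complement(seq))
```

The reverse complement is the form in which a primer that anneals to one strand would be searched for in a sequence written for the opposite strand.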

Multiple clones were sequenced and these fragments were assembled together,
sometimes including public human sequences, using bioinformatic programs to produce a
consensus sequence for each assembly. Each assembly is included in CuraGen Corporation's
database. Sequences were included as components for assembly when the extent of identity
with another component was at least 95% over 50 bp. Each assembly represents a gene or
portion thereof and includes information on variants, such as splice forms, single nucleotide
polymorphisms (SNPs), insertions, deletions and other sequence variations.
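
The inclusion rule stated above (at least 95% identity over 50 bp) can be expressed as a simple predicate. The sketch below is a simplified illustration, not the assembly software itself: it assumes the overlapping region has already been extracted from both components as two ungapped strings of equal length, and the function names and default thresholds are hypothetical parameters chosen to mirror the text.

```python
def overlap_identity(a: str, b: str) -> float:
    """Fraction of identical positions in two ungapped, equal-length overlaps."""
    if len(a) != len(b):
        raise ValueError("overlap strings must have equal length")
    if not a:
        raise ValueError("overlap must not be empty")
    matches = sum(x == y for x, y in zip(a.upper(), b.upper()))
    return matches / len(a)


def qualifies_for_assembly(a: str, b: str,
                           min_identity: float = 0.95,
                           min_overlap: int = 50) -> bool:
    """Apply the 'at least 95% over 50 bp' criterion to a pre-aligned overlap."""
    if len(a) != len(b) or len(a) < min_overlap:
        return False
    return overlap_identity(a, b) >= min_identity


if __name__ == "__main__":
    ref = "ACGT" * 13              # 52 nt toy overlap, not from the disclosure
    two_mismatches = "TT" + ref[2:]
    three_mismatches = "TTT" + ref[3:]
    print(qualifies_for_assembly(ref, two_mismatches))    # 50/52 identical
    print(qualifies_for_assembly(ref, three_mismatches))  # 49/52 identical
```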

SeqCalling assemblies produced by the exon linking process were selected and
extended using the following criteria. Genomic clones having regions with 98% identity to all
or part of the initial or extended sequence were identified by BLASTN searches using the
relevant sequence to query human genomic databases. The genomic clones that resulted were
selected for further analysis because this identity indicates that these clones contain the
genomic locus for these SeqCalling assemblies. These sequences were analyzed for putative
coding regions as well as for similarity to the known DNA and protein sequences. Programs
used for these analyses include Grail, Genscan, BLAST, HMMER, FASTA, Hybrid and other
relevant programs.

Some additional genomic regions may have also been identified because selected
SeqCalling assemblies map to those regions. Such SeqCalling sequences may have
overlapped with regions defined by homology or exon prediction. They may also be included
because the location of the fragment was in the vicinity of genomic regions identified by
similarity or exon prediction that had been included in the original predicted sequence. The
sequence so identified was manually assembled and then may have been extended using one
or more additional sequences taken from CuraGen Corporation's human SeqCalling database.
SeqCalling fragments suitable for inclusion were identified by the CuraTools™ program
SeqExtend or by identifying SeqCalling fragments mapping to the appropriate regions of the
genomic clones analyzed. Such sequences were included in the derivation of NOV6b only
when the extent of identity in the overlap region with one or more SeqCalling assemblies was
high. The extent of identity may be, for example, about 90% or higher, preferably about 95%
or higher, and even more preferably close to or equal to 100%. When necessary, the process
to identify and analyze SeqCalling fragments and genomic clones was reiterated to derive the
full length sequence.

The regions defined by the procedures described above were then manually integrated
and corrected for apparent inconsistencies that may have arisen, for example, from miscalled
bases in the original fragments or from discrepancies between predicted exon junctions, EST
locations and regions of sequence similarity, to derive the final sequence disclosed herein.
When necessary, the process to identify and analyze SeqCalling assemblies and genomic
clones was reiterated to derive the full length sequence. The following public components
were thus included in the invention: GenBank: gb_AL121901.20 PRI/HTG Homo
sapiens|Human DNA sequence from clone RPl M9G10 on chromosome 20, complete
sequence, 161593 bp.

The DNA and protein sequences for the novel Von Ebner Minor Salivary Gland
Protein-like gene are reported here as CuraGen Acc. No. CG51622-02, or NOV6b. The
disclosed novel NOV6b nucleic acid of 1035 nucleotides (SEQ ID NO:15) is shown in Table
6C. An open reading frame begins with an ATG initiation codon at nucleotides 79-81 and ends with
a TAA codon at nucleotides 1033-1035. A putative untranslated region upstream from the
initiation codon and downstream from the termination codon are underlined in Table 6C, and
the start and stop codons are in bold letters. NOV6b differs from NOV6a in the following
ways: NOV6b has 78 nucleotides at the 5' UTR, and ten base changes or deletions, numbered
with respect to NOV6b: G194>A; T195>G; C332>T; C334>T; C335>Δ; A336>Δ;
T337>Δ; C338>Δ; C339>Δ; A340>Δ (where Δ designates a base deletion).



Table 6C. NOV6b Nucleotide Sequence (SEQ ID NO:15) 

AGAGCGTTGGGTCACGTGAGGACTCCAGCGTGGCCAGGTCTGGCATCCTGCACTTACTGC 60" 

CCTCTGACACCTGGQAAG Aa?GGCCGGCCCGTGGACCTTCACCCTTCTCTGTGGTTTGCTG 120 

GCAGCCACCTTGATCCAAGCCACCCTCAGTCCCACTGCAGTTCTCATCCTCGGCCCAAAA 180 

GTCATCAJ^AGAAAAGCTGACACAGGAGCTGAAGGAGCACAACQCCACCAGCATCCTGCA 240 

CAGCTGCCGCTGCTCAGTGCCATGCGGGAAAAGGCAGCCGGAGGGATCCCTGTGCTGGGC 300 

AGCCTGGTGAAC ACCGTCCTGAAGCACATCATCTGGCTGAAGGTCATCACAGCTAACATC 360 

CTCCAGCTGCAGGTGAAGCCCTCGGCCAATGACCAGGAGCTGCTAGTCAAGATCCCCCTG 420 

GACATGGTGGCTGGATTCAACACGCCCCTGGTCAAGACCATCGTGGAGTTCCACATGACG 480 

ACTGAGGCCCAAGCCACCATCCGCATGGACACCAGTGCAAGTGGCCCCACCCGCCTGGTC 540 

CTCAGTGACTGTGCCACCAGCCATGGGAGCCTGCGCATCCAACTGCTGCATMGCTCTCC 600 

TTCCTGGTGAACGCGTTAGCTAAGCAGGTCATGAACCTCCTAGTGCCATCCCTGCCCAAT 660 

CTAGTGAAAAACCAGCTGTGTCCCGTGATC6AGGCTTCCTTCAATGGCATGTATGCAGAC 720 

CTCCTGCAGCTGGT6AAGGGTAGGTGCTCTGCTCTCTCTCCCACTTTTTCCTTTACTACG 780 

GAGCTGGCCTCCAGACCCGGAAAGGTGACCAAGTGGTTCAATAACTCTGCAGCTTCGCTG 840 

ACAATGCCCACCCTGGACAACATCCCGTTCAGCCTCATCGTGAGTCAGGACGTGGTGAAA 900 

GCTGCAGTGGCTGCTGTGCTCrCTCCAGAAGAATTCATGGTCCTGTTGGACTCTGTGGTA 960 

AACCTCAGCACAAGGCAGAGAATAGGGCCGCCCAGGCCACATCATAGGAATTTCCTGAAC 1020 
ACAGGGTGCCCCTAA 1035 
The disclosed nucleic acid sequence has 538 of 698 nucleotides (77%) identical to a
1683 bp Mus musculus von Ebner minor salivary gland protein
(GENBANK-ID:MMU46068|acc:U46068) (E value = 4.0e-[exponent illegible]).

The NOV6b protein encoded by SEQ ID NO:15 has 318 amino acid residues, and is
presented using the one-letter code in Table 6D (SEQ ID NO:16). The SignalP, Psort and/or
Hydropathy profile for NOV6b predict that NOV6b is likely to be localized extracellularly,
with a certainty of 0.6138. A cleavage site is indicated at the slash in the sequence TLS-PT,
between amino acids 24 and 25 in Table 6D. NOV6b differs from NOV6a at five positions:
S39>K; T85>I; P86>Δ; S87>Δ; R88>W (where Δ designates a deleted residue).

Table 6D. Encoded NOV6b protein sequence (SEQ ID NO:16).

AMREKPAGGI PVIiGSLVKTVLKHI IWLKVI TAKILQLQVJKPSAKDQELLVKI^LDMVAGF 120 
MTPLVKTlVEFHm'TEAQATIRMDTSASGFTRLVLSDCATSHGSI^IQLL^^ 180 
AICQVMNLLVPSLPNLVKNQLCPVIEASFNGMYADLLQLVKGRCSALSPTFSFTTELASRF 2*0 
GKVTKWFNNSAASLTMPTLDNIPFSLIVSQDVVKAAVAAVLSPEEFMVLLDSVVNLSTO 300 
RIGPPRPHHRNFLNTGCP 318 ^ 



The full amino acid sequence of the disclosed NOV6b protein was found to have 165
of 302 amino acid residues (54%) identical to, and 209 of 302 amino acid residues (69%)
positive with, the 310 amino acid residue protein von Ebner minor salivary gland protein from
Mus musculus (SPTREMBL-ACC:Q61114) (E value = 1.1e-73).

Patp results include those listed in Table 6E.



Table 6E. Patp alignments of NOV6

                                                                            Smallest
                                                                            Sum
                                                          Reading   High    Prob
Sequences producing High-scoring Segment Pairs:           Frame     Score   P(N)

patp:Y77126 Human neurotransmission-associated protein... +1        1282    6.5e-130
patp:Y99375 Human PRO1357 (UNQ706) amino acid sequence... +1        1276    2.8e-129
patp:Y86219 Human secreted protein HBHMA23, ...           +1        920     1.5e-91
patp:B58378 Lung cancer associated polypeptide sequence.  +1        920     1.5e-91
patp:B40750 Human ORFX polypeptide sequence ...           +1        679     5.2e-66
patp:Y86310 Human secreted protein HBHMA23, ...           +1        334     3.5e-33



For example, a BLAST against Y77126, a 484 amino acid neurotransmission-associated
protein from Homo sapiens, produced 275/310 (88%) identity, and 277/310 (89%)
positives (E = 6.5e-130), with long segments of amino acid identity. WO 00/01821. Y77126
is described as a putative odorant-binding protein whose cDNA was isolated from nasal polyp
tissue. NOV6 also has significant homology with a number of secreted proteins. WO
00/12708; WO 99/66041; WO 00/55180; and WO 00/54873.

The disclosed NOV6 protein (SEQ ID NO:25) has good identity with salivary gland
proteins. The identity information used for ClustalW analysis is presented in Table 6F.



Table 6F. BLAST results for NOV6

Gene Index/Identifier         Protein/Organism                          Length (aa)  Query   Identity (%)    Positives (%)   Expect

gi|9938033|ref|NP_061205.2|   von Ebner minor salivary gland protein    474          NOV6a   175/319 (54%)   222/319 (68%)   7e-80
                              [Mus musculus]                            474          NOV6b   177/317 (55%)   224/317 (69%)   9e-84

gi|13274680|emb|CAC34050.1|   novel protein similar to mouse von Ebner  285          NOV6a   79/111 (71%)    81/111 (72%)    1e-23
                              salivary gland protein, isoform 1.)       285          NOV6b   79/111 (71%)    81/111 (72%)    1e-23
                              [Homo sapiens]



This information is presented graphically in the multiple sequence alignment given in
Table 6G (with NOV6a being shown on line 1, and NOV6b shown on line 2) as a ClustalW
analysis comparing NOV6 with related protein sequences.
Table 6G. Information for the ClustalW proteins:

1) NOV6a (SEQ ID NO:14)
2) NOV6b (SEQ ID NO:16)
3) gi|9789707|gb|AAA87581.2| (U46068) von Ebner minor salivary gland protein [Mus musculus] (SEQ ID NO:64)
4) gi|13274680|emb|CAC34050.1| (AL355392) dJ1187J4.1.1 (novel protein similar to mouse von Ebner salivary
gland protein, isoform 1.) [Homo sapiens] (SEQ ID NO:65)



[ClustalW multiple sequence alignment of NOV6a, NOV6b, gi|9789707| and gi|13274680|; the aligned residues are not recoverable from the source.]



The presence of identifiable domains in NOV6 was determined by searches using
algorithms such as PROSITE, Blocks, Pfam, ProDomain, Prints and then determining the
Interpro number by crossing the domain match (or numbers) using the Interpro website
(http://www.ebi.ac.uk/interpro/).

DOMAIN results for NOV6 were collected from the Conserved Domain Database
(CDD) with Reverse Position Specific BLAST. This BLAST samples domains found in the
Smart and Pfam collections. The results are listed in Table 6H with the statistics and domain
description. The results indicate that this protein contains the BPI/LBP/CETP N-terminal
domain, bactericidal permeability increasing protein/lipopolysaccharide-binding protein/
cholesteryl ester transferase domain. The von Ebner minor salivary gland protein also
contains this domain. Amino acids 35-243 of NOV6a align with amino acids 2-206 of this 225
residue protein, as indicated in Table 6H as SEQ ID NO:66. The E value for NOV6a is 8e-15,
and 6e-15 for NOV6b. This indicates that the sequence of NOV6 has properties similar to
those of other proteins known to contain this domain.



Table 6H. DOMAIN results for NOV6

gnl|Smart|BPI1, BPI/LBP/CETP N-terminal domain (SEQ ID NO:66)
NOV6a: CD-Length = 225 residues, 91.1% aligned
Score = 74.3 bits (181), Expect = 8e-15
NOV6b: CD-Length = 225 residues, 91.1% aligned
Score = 74.7 bits (182), Expect = 6e-15

[Residue-by-residue alignment of NOV6 with gnl|Smart|BPI1 is not recoverable from the source.]

Von Ebner glands (VEG) are small lingual salivary glands. Their ducts open into
trenches of circumvallate and foliate papillae, and their secretions influence the milieu where
the interaction between taste receptor cells and sapid (taste-processing) molecules takes place.
The major secretion of human VEG is a protein with a molecular mass of 18 kD. This VEG
protein is identical to lipocalin-1. Blaker et al. isolated a cDNA clone from a human VEG
library and showed that it contained an insert of 735 bp, including an open reading frame that
encodes the human VEG protein of 176 amino acids. Blaker et al., Biochim. Biophys. Acta
1172:131-137, 1993. The VEG proteins are members of the lipocalin protein superfamily;
together with odorant-binding protein, they constitute a new subfamily. Sequence similarity to
proteins such as retinol binding protein and odorant binding protein suggests a possible
function for the human VEG protein in taste perception.
Lipocalins are a group of extracellular proteins, first described by Pervaiz and Brew,
that are able to bind lipophiles by enclosure within their structures, minimizing solvent
contact. Pervaiz and Brew, FASEB J. 1:209-214, 1987. The lipocalins make up a
heterogeneous superfamily of proteins. Although showing almost no sequence homology,
they share very similar secondary and tertiary structures. Their ability to bind hydrophobic
ligands is well established, but the physiological function of most lipocalins remains unclear.
The lipocalin from the human Von Ebner's Gland of the tongue (VEGh) contains three
sequence motifs corresponding with the papain-binding domains of cystatins, a family of
naturally occurring cysteine proteinase inhibitors. VEGh was shown to inhibit papain activity
to a similar extent as salivary cystatin S. Furthermore, synthetic peptides derived from VEGh
and cystatin comprising these three motifs inhibited papain, too. VEGh is a physiological
inhibitor of cysteine proteinases and therefore can play a role in the control of inflammatory
processes in oral and ocular tissues. Van't Hoff, et al., J. Biol. Chem. 272:1837-1841, 1997.
Furthermore, Redl et al. found enhanced LCN1 secretion in the airways of patients
with cystic fibrosis. Redl, et al., Lab. Invest. 78:1121-1129, 1998. Northern blot analysis of
RNA from normal trachea and RNA isolated from tracheal biopsies of patients with CF
indicated that the enhanced secretion was due to an upregulated expression of the LCN1 gene.
Thus, these investigators presented the first clear evidence that LCN1 is induced in infection
or inflammation and supported the idea that this lipocalin functions as a physiologic protection
factor of epithelia in vivo.

NOV6 has been analyzed for tissue expression profiles. The quantitative expression of
various clones was assessed using microtiter plates containing RNA samples from a variety of
normal and pathology-derived cells, cell lines and tissues using real time quantitative PCR
(RTQ PCR; TAQMAN®). RTQ PCR was performed on a Perkin-Elmer Biosystems ABI
PRISM® 7700 Sequence Detection System. Various collections of samples are assembled on
the plates, and referred to as Panel 1 (containing cells and cell lines from normal and cancer
sources), Panel 2 (containing samples derived from tissues, in particular from surgical
samples, from normal and cancer sources), Panel 3 (containing samples derived from a wide
variety of cancer sources) and Panel 4 (containing cells and cell lines from normal cells and
cells related to inflammatory conditions). See Taqman Example.

As shown in Table 6J, below, the 96 well plate (4 control wells, 92 test samples) for
panel 1.2 and its variants is composed of RNA/cDNA isolated from various human cell lines
that have been established from normal and malignant human tissues. These cell lines have
been extensively characterized by investigators in both academia and the commercial sector
regarding their tumorigenicity, metastatic potential, drug resistance, invasive potential and
other cancer-related properties. They serve as suitable tools for pre-clinical evaluation of
anti-cancer agents and promising therapeutic strategies.

As shown in Table 6J below, panel 4 includes samples on a 96 well plate (2 control
wells, 94 test samples) composed of RNA (Panel 4r) or cDNA (Panel 4d) isolated from
various human cell lines or tissues related to inflammatory conditions.

TaqMan oligo set Ag719 for the NOV6 gene includes the forward, probe and reverse
oligomers. Sequences for the oligos are shown in Table 6I.



Table 6I: Taqman primers

Position  Primers  Sequences                                    SEQ ID NO      Length

260       Forward  5'-CCAGGCTGAAGGTCATCAC-3'                    SEQ ID NO:67   19
281       Probe    FAM-5'-CTAACATCGTCCAGCTGCAGGTGAAG-3'-TAMRA   SEQ ID NO:68   26
316       Reverse  5'-GACTAGCAGCTCCTGGTCATT-3'                  SEQ ID NO:69   21
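
The positions listed for the forward and reverse oligomers imply a particular amplicon on the NOV6a template. The sketch below shows one way to locate the two primers and measure the amplicon. It is an illustration under stated assumptions: the template is a short fragment transcribed from Table 6A with extraction damage normalized to upper-case bases, the reverse primer sequence is read as CCTGGTCATT at its 3' end, and the function names are hypothetical. The probe is not searched because its transcription differs from the template at one base.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")

# Fragment of the NOV6a sequence around the primer sites (assumed transcription).
TEMPLATE = (
    "CCATCCAGGCTGAAGGTCATCACAG"
    "CTAACATCCTCCAGCTGCAGGTGAAGCCCTCGGCCAATGACCAGGAGCTGCTAGTCAAGATC"
)
FORWARD = "CCAGGCTGAAGGTCATCAC"      # SEQ ID NO:67
REVERSE = "GACTAGCAGCTCCTGGTCATT"    # SEQ ID NO:69


def reverse_complement(seq: str) -> str:
    """Reverse complement of a DNA sequence written 5' to 3'."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def find_amplicon(template: str, forward: str, reverse: str):
    """Return (forward start, reverse-site start, amplicon), 0-based offsets.

    The reverse primer anneals to the opposite strand, so its reverse
    complement is what appears in the template as written.
    """
    template = template.upper()
    fwd_start = template.find(forward.upper())
    if fwd_start < 0:
        raise ValueError("forward primer not found")
    rev_site = reverse_complement(reverse)
    rev_start = template.find(rev_site, fwd_start + len(forward))
    if rev_start < 0:
        raise ValueError("reverse primer site not found downstream")
    return fwd_start, rev_start, template[fwd_start:rev_start + len(rev_site)]


if __name__ == "__main__":
    fwd_start, rev_start, amplicon = find_amplicon(TEMPLATE, FORWARD, REVERSE)
    print(rev_start - fwd_start, len(amplicon))
```

On this fragment the reverse primer site begins 56 bases downstream of the forward primer, which matches the difference between the listed positions (316 and 260), and the amplicon spans 77 bases.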



Table 6J: TaqMan Results



PANEL 1.2 


Panel 4D 


Tissue Name 


% Rcl. 


%Rel. %Rel. 


Tissue Name 


%RoL 


%ReL 


%ReI. 




Expn. 


Expn. 


Expn. 




Expn. 


Expn. 


Expn. 




Run 1 


Run2 


Run 3 




Run 1 


Run2 


Runs 










33768_Sccondary 'nil_anti- 


0,0 


0,0 


0.0 


Ehdotheiia! cells 


0.0 


0.0 


0.0 


CD28/anti-CD3 

93769 Secondary Th2 anti- 


0.0 


0.0 


0,3 


Endothelial cells (treated) 


0.0 


0.0 


0.0 


CD28/anti-CD3 














93770 Secondary Trl anti- 


0.0 


0.0 


0.0 


Pancreas 


ai 


0.0 


0.0 


CD28/anti-CD3 
93573„Secondary IliLresting 


0.0 


0.0 


0,0 


Pancreatic ca.CAPAN 2 


0.0 


0.0 


0.0 


day 4-6 in IL-2 

93572_Secoadary Thi^restiflg 


0.0 


0.0 


0.0 


Adrenal Gland (new lot*) 


0.0 


0.0 


0.0 


day 4-6 in IL-2 






0.0 








93571 Secondary Trl resting day 


0.0 


0-0 


TTiyroid 


O.l 


0.0 


0.0 


4^6 in IL-2 


0.5 




0.0 








93568_primaTyThl anti- 


0.0 


Salavaiy gland 


1.6 


0.8 


1.8 


CD28/anti-CD3 






0.0 








93569 jMrimaryTh2 anti- 


0.5 


0.0 


Pituitaiy gland 


0,3 


0.0 


0.0 


CD28/anti-CD3 


0.0 


0,0 


0.0 








93570_priniary Tf l_anti- 


Brain (fetal) 


0.0 


0.0 


0.0 


CD28yanti-CD3 














93565_iffiniary Thl_resthig dy 4- 


0.3 


0.0 


0.0 


Brain (whole) 


0.0 


0,0 


0.0 


6 in IL-2 






0.0 








93566_j)rimary Th2_resting dy 4- 


0.0 


0.0 


Brain (amygdala) 


0.0 


0.0 


0.0 


6 in IL-2 




0.0 


0.0 








93567_priTnaiy TYLrcsting dy 4^ 


0,0 


Brain (cerebellum) 


0.0 


0.0 


ao 


in IL-2 






0.0 








93351_CD45RACD4 


0.0 


0,0 


Brain (hippocampus) 


0.0 


0.0 


0.0 


lymphocyte_anti-CD28/anti-CD3 


0.0 


0.0 


0.0 








93352_CD45ROCD4 


Brain (thalamus) 


0.0 


0,0 


0.0 


lyniphocyte_anti-CD28/anti-CD3 


0.0 


0,0 


0.0 








93251_CD8 Lymphocytes^anti- 


Cerebral Cortex 


0.0 


0.0 


0,0 


CD28/anti-CD3 






0.0 








93353_chronic CD8 Lymphocytes 


0.0 


0.0 


Spinal cord 


OA 


0.0 


0.1 


2ry restiiig dy 4-6 in IL-2 


0.2 


0.0 


0.0 


CNS ca. (gUo/astro)U87- 


0.0 


0.0 


0,0 


93574_ohfonic CD8 Lymphocytes 
PCT/US01/10039



MG 








Zfy_activatcd CD3/CD28 








CNSca. {glio/astro)U-118- 








?3354^CD4_none 


0.0 


0.0 


o.o 


MG 


0.0 


0.0 


ao 


)3252_Secondary 


ao 


0.4 


0.0 


CNS (astro) SW1783 
CNS ca* (neuro; met ) SK- 


0.0 


0-0 


0.0 


n5lAln2/lrI_aiiti-k>uy5 Lnii 














'3 J03_LAK ceJls_TC6tmg 




U.U 


U.U 


N-AS 


0.0 


0.0 


0.0 










CNS ca. (astro)SF-539 


0,0 


0.0 


0.0 


)3788 LAK cells 11^2 


0.2 


ao 


O.O 


CNS ca, (astro) SNB-75 


0.0 


0.0 


ao 


?3757^LAK ceLls_IU2+IL-12 


0.0 


0.0 


A A 

u*u 








?3789__LAK cells_IL-2+IFN 


0.3 


ao 


0,2 


CNS ca. (glio)SNB-19 


0.0 


0.0 


ao 


jamma 








CNS ca.(gUo) U251 


0.0 


ao 


ao 


w7yu_LAK ceUS_lL*-2+ LL-io 




U.U 


O.u 








?3104_LAK 


ao 


0.0 


O.O 


CNS ca. (glio) SF-295 


0.0 


ao 


ao 


ceils PMA/ionomycinandIL-18 








Heart 


0.0 


ao 


ao 


?3578_NK Cells n^2_^resting 
93 109_Mixed Lymphocyte 


ao 

0.0 


ao . 
ao ^ 


0.0 

o.o 


Skeletal Muscle (new lot*) 


0,0 


ao 


0.0 


Rcaction_Two Way MLR 














93 1 1 0_Mixed Lymphocyte 


0.0 


0.0 


0.0 


Bone marrow


0.0 


ao 


0.0 


leaction_Two Way MLR 
93 1 1 l^Mixed Lymphocyte 


ao 


ao 


0.0 


Thymus 


0.0 


ao 


0.0 


Reaction_Two Way MLR 














93 1 1 2_Monomiclear Cells 


ao 


0.0 


0.0 


Spleen 


0.0 


0.0 


0.0 


CPBMCs)_restiiig 














93 1 13_Mononuclcar Cells 


ao 


0.0 


0.0 


Lymph node 


0.3 


0.1 


0^2 


J*BMCs)JPWM 














Jjxl*T IVlUIIUilUljlCOl ViCllS 


0*0 


0.0 


0.0 




0.0 


0.0 


0.0 


(PBMCs) PHA-L 








Stomach 


23 


3.4 


7.3 


93249 Ramos (B cell) none 


0.0 


ao 


0.0 


Small intestine 


0.0 


ao 


ao 


93250_Ramos (B ccll)Jonomycin 


ao 


ao 


0.0 


Colon ca, SW480 


0.0 


ao 


0.0 


93349_B lymphocytes^PWM 


ao 


ao 


0.0 


Colon C8,*(SW480 








93350 Blymphoytes CD40Land 


ao 


ao 


0.0 


met)SW620 


0.0 


ao 


0.0 


[L-4 " 














92665_BOL-l 
(Eosinophil)_dbcAMP 


ao 


ao 


ao 


Colon ca. HT29 


0.0 


ao 


ao 


differentiated 
93248_BOL.l 

(EoBinophil)_dbcAMP/PMAiono 


0.0 


ao 


0.0 


Colon ca, HCT-116 


0.0 


0.0 


0.0 


mycin 








Colon ca. CaCo-2 


0.0 


ao 


ao 


y335o_Denantic LeiisjDone 








832 1 9 CG Well to Mod Diff 








93355_Dendritic Cells^LPS 100 


ao 


0.2 


ao 


(OD03866) 


0.0 


0.0 


0.0 


ng/ml 






6.0 


Colon ca,HCC-2998 


0.0 


0.0 


0.0 


93775_Dendritic Cc«s_aati-CD40 


ao 


0,0 


Gastric ca,* (liver met) NCI- 








93 774_Monocytes_restiiig 


0,0 


U.U 


A rt 

U.U 


N87 


0,1 


0.0 


0.1 










Bladder 


0.0 


ao 


ao 


93776_Monocyte8_LPS 50 ng^ml 


0.3 


A A 
U.O 


A A 
U.U 


Trachea 


1 00.0 


loao 


100.0 


93581_Macrophages„resting 
93582_Macrophages_LPS 100 


ao 
ao 


0.0 

ao 


0.0 

ao 


Kidney 


0.0 


0.0 


ao 


ng/ml 














93098_HUVEC 


ao 


ao 


0.0 


Kidney (fetal) 


0.0 


0.0 


0.0 


(Endothelial) none 














93099_HUVEC 


0.0 


0.4 


0.0 


Renal ca. 786-0 


0.0 


ao 


ao 


(Bndoriielial)_starved 

93100 HUVEC (Endothelial) IL- 


ao 


ao 


ao 


Renal ca. A498 


0.0 


0.0 


0.0 


Ib 

93779_HUVEC 


0.0 


0.0 


as 


Renal ca. RXF 393 


0.0 


0.0 


0.0 


(Endothelial)_IFN gamma 
93102_HUVEC 

(BndoSielial)_TNF alpha + IFN ^ 


as 


ao 


0.0 


Renal ca. ACHN 


0.0 


ao 


0.0 


gamma 

9310LHUVEC 


o.o 


ao 


0.0 


Renal ca, UO-31 


0.0 


0.0 


ao 


(BndoSielial) TNF alpha +IL4 
93781 HUVEC (aidothelial) IL- 


0.0 


ao 


0.0 


Renal ca.TK-10 


0.0 


0.0 


0.0 


11 

93583_Lung Microvascular 


0.0 


0.3 


0.0 


Liver 


0.0 


0.0 


ao 


Endothelial Cclla_none 
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Liver (fetal) 


0.0 


0.0 


Liver ca. (hepatoblast)


0.0 


0.0 


HepG2 


Lung


5.2 


2.3 


Lung (fetal)


1.0 


0.3 


r iTim rn /'email CeinLX-l 


0.0 


0.0 


Lung ca. (small cell) NCI- 






H69 


0.0 


0.0 


Lung ca. (s.ccll var.) SHP-77 


0.0 


0.0 


Lung ca. (large ceU)NCI- 






H460 


0.0 


0*0 


Lung ca. (non-sm. cell) A549


0.1 


0.0 


Lung ca. (non-s.cell) NCI- 






H23 


0.0 


0.0 


Lung ca (non-s.cell) HOP-62 


0.0 


0.0 


Lung ca. (non-s.cl) NCI- 






H522 


ao 


0.0 


Lung ca. (squam.) SW 900 


0.3 


0.1 


Lung ca. (squam.) NCI-H596


0.0 


0.0 


Mammary gland 


0.1 


0.0 


Breast ca.* (pi efftision) 






MCF-7 


0.0 


0.0 








MB^231 


0.0 


0.0 


Breast ca.* (pi. effbsion) 






r47D 


u.u 


ft ft 


Breast ca. BT-549 


0,0 


0,0 


Breast ca. MDA-N


0.0 


0.0 


Ovary 


0.0 


0.0 


Ovarian ca,OVCAR-3 


0.0 


0.0 


uvanan cb.vjvl^aiv-t 


0.0 


0 0 


Ovarian ca.OVCAR*5 


0.9 


0.3 


Ovarian ca. OVCAR^S 


o.o 


0.0 


Ovarian ca.IGROV-1 


0,0 


0.0 


/"ViraTnan na W 1 ac/*1'f AG) Qk 
WValVall vHt {aoMiCO/ 






OV-3 


0,0 


0.0 


Uterus 


0.0 


0,0 


Placenta


0.0 


0.0 


Prostate 


0.0 


0.0 


Prostate ca,* (bone met)PC-3 


0,0 


0.0 


Testis 


0.2 


0.0 


Melanoma Hs688(A).T


0.0 


0.0 
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c 


'3584 Lung Microvascular 


0.0 


0.0 


0.0 


1 


Endothelial Cells_TNFa (4 ng/ml) 








0.0 E 


Old ILIb ng/ml) 








c 


^2662 Mtcfovascular Dermal 


0.0 


0.0 


0.0 


0.0 E 


Tadotheliumjione 








c 


>2663_Micr^vasular Dermal 


0.3 


0.0 


o.b 


E 


jndothelium_TNFa(4 ng/ml) and 








4.9 1 


Lib (1 ng/ml) 








t 


J3773 Bronchial 


0.0 


0.0 


0.0 


E 


epithelium TNFa (4 ng/ml) and 








0.6 1 


Lib (1 ng/ml)** 










>3347_Small Airway 


0.0 


0.2 


0.0 


0.0 


Ipitheliumjione 










)3348 Small Airway 


0.0 


0.2 


0.0 




BpltheTium_TNFa (4 ng/ml) and 








0.0 


[Lib (1 ng/ml) 










?2668_Coroneiy Artery 


0.0 


0.0 


0.0 


0.0 


3MC_rcsting 




0.0 






)2669 Coronery Artery 


0.0 


0.0 




SMC^INFa (4 ng/ml) and ILlb (1 








0.0 


[ig/ml) 




0,2 


0,0 


01 


?3 1 07_a5trocytes„resting 


0.0 




?3108 astrocytes TNFa (4 ng/ml) 


0.5 


0.0 


0.0 


0.0 


and TLlb CI ndtsA) 










92666 KU-812 i 


0.0 


0.0 


0.0 


n n 

u.u 








0.0 




92667_KU-812 


0.0 


0.0 


0,0 


'BasonhiU PMA/ionovcin 










?3579_CCDU06 


0.0 


0.0 


0.0 


0.4 


[Keral5iocytes)_noiie 










93580_CCU110d 


ft 1 


A ft 


ft n 




[Keratmocytes)_TNFa and IFNg 








0.0 










0.0 


9379LLiver Cirrhosis 


1.6 


2.3 


1,9 




93792_Lupus Kidney 


0.3 


0.0 


0,0 


0.0 


93577_NCI'H292 


88.9 


78.5 


86.5 


0.0 


9335 8_NCI-H292_IL-4 


64.2 


100,0 


5o.o 


0.0 
0.0 


9^*^60 NCI-H292 11^9 


68.8 


54.0 


55.1 


0.0 


93359_NCI-H2y /^iL"l 3 


Ay ft 






0.0 


93357_Na-H292_IFN gamma 


30.6 


10.4 


19.2 


0,0 


93777_HPAEC_- 


0.0 


0.0 


0,0 


93778 HP ABC IL-1 beta/TNA 


0.0 


0.0 


0.0 


0.0 


alpha 










9'^254 Normal Human Luniz 


0.0 


0.0 


0.0 




FiTtrnhlast tiOtie 










93253 Normal Human Lung 


0.2 


0.0 


0.0 




Fihrohlflst TNFa ^4 nsf/ml^ and 








0.0 


[L-lb n njz/ml'y 






0.0 




93257 Normal Human Lung 


0.0 


0.0 


0.0 


FibrobTast_IL4 










93256 Nonnal Human Lung 


0.1 


0.0 


0.0 


0.0 


FibroblaBt„IL-9 










93255 Normal Human Lung 


0,0 


0.0 


0,0 


0.0 


FibrobTast_IL-13 










93258_Normal Human Ltmg 


0.0 


0,0 


0.0 


0.0 


Fibroblast IFN gamma 










93106 Drnnal Fibroblasts 


0.0 


0.0 


0.0 


0.0 


CCDl070_resting 










93361 Dermal Fibroblasts 


0.0 


0.0 


0.0 


0.0 


CCD1070^TNF alpha 4 ng/ml 






0,0 




93105 Drnial Fibroblasts 


0.0 


0,0 


0.1 


CCD1070_IL-1 beta 1 ng/ml 






0.0 




93772_deiTnal i^oblast_IFN 


0.0 


04 


0.0 


gamma 
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Melanoma* (met) 








93771_dermal fibroblast^lL^ 


0.0 


0.0 


n rt 
Q.O 


Hs688(B).T 


0.0 


0*0 


piO 










Melanoma UACC-62 


0.1 


0-0 


0.1 


93259_IBD Colitis I** 


1.6 


1.2 


1.4 


Melanoma Ml 4 


0.0 


0.0 


0.0 


93260_IBD Colitis 2 


0,0 


0.0 


O.O 


Melanoma LOXIMVI 


0.0 


0,0 


0.0 


93261_IBD Crohns 


L3 


3.3 


0.4 


iNaeianoma^ \mci^ oJS,-ftiiiL- 










0.3 


0 0 


1.4 


5 


0.0 


0.0 


0.0 










Adipose 


0.0 


0.0 


0.2 


73501 9_Lung_none 


100.0 


72,2 


100.0 








54028- l_Thyrau3_none 


1.0 


04 


0.0 










64030'l_Kidney_none 


0.8 


0.4 


0.8 



In Table 6J the following abbreviations are used: ca. = carcinoma, * = established from metastasis, met = metastasis, s cell var
= small cell variant, non-s = non-sm = non-small, squam = squamous, pl. eff = pl effusion = pleural effusion, glio = glioma,
astro = astrocytoma, and neuro = neuroblastoma.



The results from Panel 1.2 indicate that NOV6 is expressed in normal trachea, salivary
gland and lung, but NOV6 is not expressed in any tumor tissues. The results from Panel 4D
indicate that NOV6 is expressed highly in lung and in the lung airway epithelial cell line
NCI-H292, and that treatment with gamma interferon reduces NOV6 expression 3-10 fold in
these cells. NOV6 is expressed in normal airway tissue such as the lung and trachea, and
expression is down regulated in gamma interferon treated tissues. The reduction in NOV6
may contribute to the inflammatory processes in the airways due to allergy/asthma,
emphysema or viral infection. Protein therapeutics derived from NOV6 might reduce or
eliminate inflammation in the lung due to asthma/allergy, emphysema, or viral infection.
Since it is known that gamma interferon treatment stimulates the expression of the cell
adhesion molecule ICAM-1 on NCI-H292 cells, it is possible that treatment with NOV6 would
prevent the expression of cell adhesion molecules and reduce or prevent leukocyte infiltration
into the lung. See, e.g., Togas, et al., Euro J Pharmacol 345:199-206, 1998.
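
As a rough check on the stated 3-10 fold reduction, the Panel 4D values printed in Table 6J for untreated NCI-H292 cells and for gamma interferon treated NCI-H292 cells can be compared directly. The following is only a sketch: the function name is hypothetical, the three values per sample are copied from the table as printed, and it is an assumption that the three columns can be paired one to one.

```python
def fold_reduction(untreated, treated):
    """Return untreated/treated for each paired relative-expression value."""
    if len(untreated) != len(treated):
        raise ValueError("value lists must have the same length")
    return [u / t for u, t in zip(untreated, treated)]


# Relative expression (%) as printed in Table 6J, three columns per sample.
# Assumption: column k of one sample corresponds to column k of the other.
nci_h292_untreated = [88.9, 78.5, 86.5]   # 93577_NCI-H292
nci_h292_ifn_gamma = [30.6, 10.4, 19.2]   # 93357_NCI-H292_IFN gamma

folds = fold_reduction(nci_h292_untreated, nci_h292_ifn_gamma)
for column, fold in enumerate(folds, start=1):
    print(f"column {column}: {fold:.1f}-fold reduction")
```

Under these assumptions the three ratios fall between roughly 3 and 8, which is in line with the range given in the text.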

The similarity information for the NOV6 protein and nucleic acid disclosed herein
suggests that NOV6 may have important structural and/or physiological functions characteristic
of the salivary gland protein family. Therefore, the nucleic acids and proteins of the invention
are useful in potential diagnostic and therapeutic applications and as a research tool. These
include serving as a specific or selective nucleic acid or protein diagnostic and/or prognostic
marker, wherein the presence or amount of the nucleic acid or the protein are to be assessed, as
well as potential therapeutic applications such as the following: (i) a protein therapeutic, (ii) a
small molecule drug target, (iii) an antibody target (therapeutic, diagnostic, drug
targeting/cytotoxic antibody), (iv) a nucleic acid useful in gene therapy (gene delivery/gene
ablation), (v) a composition promoting tissue regeneration in vitro and in vivo, and (vi) a
biological defense weapon. The novel nucleic acid encoding NOV6, and the NOV6 protein of
the invention, or fragments thereof, may further be useful in diagnostic applications, wherein
the presence or amount of the nucleic acid or the protein are to be assessed.

The nucleic acids and proteins of the invention are useful in potential diagnostic and
therapeutic applications implicated in various diseases and disorders described below and/or
other pathologies. For example, the compositions of the present invention will have efficacy
for treatment of patients suffering from olfactory disorders, salivary disorders, digestive
disorders, oral immunologic disorders, poor oral health, inflammatory processes in the airways
due to allergy/asthma, emphysema or viral infection, cystic fibrosis, obesity and/or other
pathologies and disorders of the like.

The polypeptides can be used as immunogens to produce antibodies specific for the
invention, and as vaccines. They can also be used to screen for potential agonist and
antagonist compounds. For example, a cDNA encoding the salivary gland-like protein may be
useful in gene therapy, and the salivary gland-like protein may be useful when administered to
a subject in need thereof. By way of nonlimiting example, the compositions of the present
invention will have efficacy for treatment of patients suffering from bacterial, fungal,
protozoal and viral infections, olfactory disorders, salivary disorders, digestive disorders,
oral immunologic disorders, poor oral health, inflammatory processes in the airways due to
allergy/asthma, emphysema or viral infection, cystic fibrosis, obesity and/or other pathologies
and disorders of the like.

The novel nucleic acid encoding salivary gland-like protein, and the salivary gland-like
protein of the invention, or fragments thereof, may further be useful in diagnostic applications,
wherein the presence or amount of the nucleic acid or the protein are to be assessed.

These materials are further useful in the generation of antibodies that bind
immunospecifically to the novel NOV6 substances for use in therapeutic or diagnostic methods. These
antibodies may be generated according to methods known in the art, using prediction from
hydrophobicity charts, as described in the "Anti-NOVX Antibodies" section below. In one
embodiment, a contemplated NOV6 epitope is from about aa 25 to 65. In another
embodiment, a NOV6 epitope is from about aa 95 to 105. In additional embodiments, NOV6
epitopes are from aa 135 to 160, 225-260, and from 290 to 310.

NOV7 

A novel nucleic acid was identified on chromosome 11 by TblastN using CuraGen
Corporation's sequence file for CD-81 or homolog as run against the Genomic Daily Files
made available by GenBank or from files downloaded from the individual sequencing centers.
The nucleic acid sequence was predicted from the genomic file GenBank Accession Number:
AC016702, by homology to a known CD-81 or homolog. Exons were predicted by homology
and the intron/exon boundaries were determined using standard genetic rules. Exons were
further selected and refined by means of similarity determination using multiple BLAST (for
example, tBlastN, BlastX, and BlastN) searches, and, in some instances, GeneScan and Grail.
Expressed sequences from both public and proprietary databases were also added when
available to further define and complete the gene sequence. The DNA sequence was then
manually corrected for apparent inconsistencies, thereby obtaining the sequences encoding the
full-length protein.

The disclosed NOV7 nucleic acid of 754 nucleotides (also referred to as
GM_51624520_A, or CG54665-01) is shown in Table 7A. An open reading frame begins with an
ATG initiation codon at nucleotides 5-7 and ends with a TGA codon at nucleotides 746-748.
A putative untranslated region upstream from the initiation codon and downstream from the
termination codon are underlined in Table 7A, and the start and stop codons are in bold letters.
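
The stated coordinates can be checked with simple arithmetic: an open reading frame that runs from nucleotide 5 through nucleotide 748 should contain a whole number of codons, and one fewer residue than codons once the stop codon is excluded. The sketch below is illustrative; the function name and dictionary keys are hypothetical, and the coordinates are taken from the paragraph above as 1-based, inclusive positions.

```python
def orf_summary(start_first_nt, stop_last_nt, total_length):
    """Summarize an ORF given 1-based inclusive coordinates.

    start_first_nt: first nucleotide of the initiation codon
    stop_last_nt:   last nucleotide of the termination codon
    total_length:   length of the full nucleotide sequence
    """
    length_nt = stop_last_nt - start_first_nt + 1
    if length_nt % 3 != 0:
        raise ValueError("ORF length is not a multiple of three")
    codons = length_nt // 3
    return {
        "orf_length_nt": length_nt,
        "codons_including_stop": codons,
        "encoded_residues": codons - 1,           # stop codon encodes no residue
        "upstream_utr_nt": start_first_nt - 1,
        "downstream_utr_nt": total_length - stop_last_nt,
    }


nov7_orf = orf_summary(start_first_nt=5, stop_last_nt=748, total_length=754)
for key, value in nov7_orf.items():
    print(f"{key}: {value}")
```

With these inputs the encoded length comes to 247 residues, which agrees with the protein length given for SEQ ID NO:18.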

Table 7A. NOV7 Nucleotide Sequence (SEQ ID NO:17)

CACCA T^C^AAGGCa^CTgrerGAGerGCATGAAGTAT^CT 

GGCXKSGGCCTGCCTGCTGGCCATCaaCATCTGaaTCATaGTGQACCGCACCGGCOT 

CTGCCy^TO^TC^rOCTCCTCACGGGCGCCTACATCCrrCGTGGC 

CTTCCTGGGCTGCTGCGGGGCCGTCaSTGAGAACAAGTGTCT 

ATCATCTTCCTGGCAGAGCTCTCAGCAGCCATCCTGGCCTTCATCTTCAGGGAAAATCTCACCCGAGAAT 
TCTTCACCAAOGGGCTCyiCay^GCACTACCAGaGC^ 

CTCGGTCATGATCACATTTGGTTGCTGCGGGQTCAACGGGCCTGAAGACTTTi^ 

GTQAAGAGGTGCCGGCGCCTGCTGCCGGAGGAACCCCAAAGXCGGCAC<X5GGTCCTGCTGAGCCGGGAGG 
AGTGGCTCCTGGGAAGGAGCCTATTCCTAAACAAGCmQCAGGGCTGTTACACGGTGATCCTCAAC^ 
GGAGACCTACGTCTACTTGGCCGGAGCCCTTGCCATCGGGGTACTGGCCATCGAGGTATTTCGCC^ 
CTTTGCCATQ TQCGTCTTCCQGGGCATCCAGTAGAQQQTATQQCCTQAAQCCTG 

The disclosed nucleic acid sequence has 512 of 711 bases (72%) identical to a 935 bp
Gallus gallus CD-81 mRNA (gb:GENBANK-ID:AF206661|acc:AF206661 Gallus gallus
neuronal tetraspanin mRNA, complete cds) (E value = 2.46-^^*).

The NOV7 protein encoded by SEQ ID NO:17 has 247 amino acid residues, and is
presented using the one-letter code in Table 7B (SEQ ID NO:18). The SignalP, Psort and/or
Hydropathy profile for NOV7 predict that NOV7 has a signal peptide and is likely to be
localized at the plasma membrane with a certainty of 0.6400. The SignalP shows a signal
sequence is coded for in the first 28 amino acids, i.e., with a cleavage site at the slash in the
sequence ACL-LA, between amino acids 27 and 28 in Table 7B.
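
The predicted cleavage position can be illustrated by splitting the N-terminal residues at position 27. This is a sketch only: the function name is hypothetical, and the 40-residue string is an assumption, transcribed from the NOV7 rows of the alignment tables in this section with apparent OCR errors corrected, so it should not be taken as the authoritative SEQ ID NO:18.

```python
def split_signal_peptide(protein, last_signal_residue):
    """Split a protein at a predicted cleavage site.

    last_signal_residue is the 1-based position of the final residue of the
    signal peptide; cleavage occurs immediately after it.
    """
    if not 0 < last_signal_residue < len(protein):
        raise ValueError("cleavage position lies outside the sequence")
    return protein[:last_signal_residue], protein[last_signal_residue:]


# Assumed N-terminal 40 residues of NOV7 (transcribed from the alignments).
nov7_n_terminus = "MEGDCLSCMKYLMFVFNFFIFLGGACLLAIGIWVMVDPTG"

signal, mature_start = split_signal_peptide(nov7_n_terminus, 27)
print("signal peptide :", signal)
print("mature protein :", mature_start + "...")
```

Under that transcription the signal peptide ends in ACL and the remainder begins with LA, matching the ACL-LA motif cited for the cleavage site.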






Table 7B. Encoded NOV7 protein sequence (SEQ ID NO:18).



F 

IiQCCOAVRENKCLLIiFFFLFILIIFMELSAAIliAPIFRENLTOEPPTKQljTKHrQGNWDT^^ 

WITFGCGGVNGPEDFKFAPWIVKRCRRLIiPEEPQSRPQVIiLSREECLItQ 

XyVYLAGAXAIGVLAIEVFRHDLCHVPLPGHPVEGMA 



The full amino acid sequence of the protein of the invention was found to have 180 of
234 amino acid residues (76%) identical to, and 199 of 234 residues (85%) positive with, the
247 amino acid residue neuronal tetraspanin protein from Gallus gallus (ptnr:
TREMBLNEW-ACC:AAF19031) (E value = 2.0e-^^).
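
The percentages quoted for the Gallus gallus comparisons follow directly from the match counts. The short sketch below recomputes them; the function name and dictionary labels are hypothetical, and the counts are copied from the text above.

```python
def percent_identity(matches, aligned_length):
    """Percentage of aligned positions that match."""
    if aligned_length <= 0:
        raise ValueError("aligned_length must be positive")
    return 100.0 * matches / aligned_length


# (matches, aligned length, percentage stated in the text)
reported = {
    "nucleotide identity vs. Gallus gallus mRNA": (512, 711, 72),
    "protein identity vs. Gallus gallus": (180, 234, 76),
    "protein positives vs. Gallus gallus": (199, 234, 85),
}

for label, (matches, length, stated) in reported.items():
    pct = percent_identity(matches, length)
    print(f"{label}: {matches}/{length} = {pct:.2f}% (stated {stated}%)")
```

In each case the stated figure equals the computed value truncated to a whole number, rather than rounded to the nearest integer.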

Patp results include those listed in Table 7C. 



Table 7C. Patp alignments of NOV7

                                                                                 Smallest
                                                                                 Sum
                                                              Reading   High     Prob
Sequences producing High-scoring Segment Pairs:               Frame     Score    P(N)

Patp:B49503 Clone HCE1K90 #1 - Homo sapiens, 248 aa.          +2        1080     1.7e-108
Patp:B49509 Clone HCE1K90 #2 - Homo sapiens, 164 aa.          +2        835      1.5e-82
Patp:W61618 Clone HPWAE25 of TM4SF superfamily H. sapiens     +2        328      8.1e-29



For example, NOV7 shows good homology with two receptor proteins from the 4
transmembrane superfamily (B49503 and B49509). PCT application WO 00/70076. The
alignments of NOV7 with these proteins are shown in Tables 7D and 7E.



Table 7D. Alignment of NOV7 with B49503 (SEQ ID NO:70).

Length = 248 Plus Strand HSPs:
Score = 1080 (380.2 bits), Expect = 1.7e-108, P = 1.7e-108
Identities = 218/235 (92%), Positives = 220/235 (93%), Frame = +2



H0V7! 
B49503 
N0V7: 



1 MEGDCliSCMKYLMFVFNFFIFLGGAGLIAIGIWVMVDPTGBllEIVAANPLLLTGAYILIiA 60 

llliMlllllllltltllllllllllllltlltllllllMltllltllllllltlll) 

1 MEGDCLSCMKYLMFVFNFFIFLGGACLIAIGIWVWDPTGFREIVAANPLLLTGAYILLA 60 



61 MGGLLFLLGFLGCCGAVRENKCLLLFFFLFILIIFLAELSAAILAFIFRENLTREFFTKG 120 
lIlllllttlllllltllilllllltlllilllllllllllllllllMllllllltll 
B4 9503: 61 MGGLLFLLGFLGCCGAVRENKCLLLFFFLFILIIFLAELSAAIIAFIFRENLTREFFTKE 120 

N0V7; 121 liTKHYQGNNDTDVFSATWNSVMlTFGCCGVNGPEDFKFAPWlVKRCRRL LPE 172 

IIIIIIMlllltllllltlllillltllllillltlll M I +1) 
849503:121 LTKHYQGNNDTDVFSATWNSVMITFGCCGVNGPEDFKFAS— VFRLLTLDSEEVPEACCR 178 

N0V7: 173 -EPQSRDGVLLSREECLLGRSLFLNKQQGCYTVILNTFETYVYLAGALAIGVIAIEVF 228 

ItllMIIIIIIIIIMlltltllll llillllll)ltlli!tlllllllllll + l 

B49S03:179 REPQ3RDGVLL3RBECLLQR3LFLNKQ-GGYTVILNTFETYVYIAGAIAIGVLAXELF 235 
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Table 7E. Alignment of NOV7 with B49509 (SEQ ID NO:71). 



Length = 164 Plus Strand HSPs:
Score = 835 (293.9 bits), Expect = 1.5e-82, P = 1.5e-82
Identities = 158/159 (99%), Positives = 158/159 (99%), Frame = +2

N0V7: 1 MEGDCLSCMKYI^WIl^FFiniGGACXIAlGlffVMVDPTGFREIVAANPLLLTG^ 60 

lllllltlI|ll)ill)lll)llllll)MI)lll)llllllllllllllll)lll)IM 

B49509: 1 MEGDCLSCMKmiFVWFFIFLGGACLLAIGIWVMVDPTGFREIVAANPLIJiTGA^ 60 
N0V7: 61 MGGLLFLLGFLGCCGAVRENKCLLLFFFLFlLlIFUyELSAAILAFlFRENLTREFF^ 120 

iiiiiiiiiiitiiiiiiiitiniiiiiiiiiiiiiiiiiiiiiiMiiiitiiiiM 

B49509: 61 MGGLLFLLGFLGCOGAVRENKCLLLFFFLlf'ILIIFIiAELSAAII^FIFREiaLTREPFtKE 120 
N0V7: 121 LTKHYQGNNDTDVFSATWWSVMITFGCCGVHGPEDFKFA 4^1 

iitiiiiiiiiitiiiiiiiiniiiiiiiiiiMDii 

B49509;121 LTKHYQGNNPTDVFSATWNSVMITFGCCGVNGPEDFKFA 159 



Further BLAST analysis produced the significant results listed in Table 7F. The
disclosed NOV7 protein (SEQ ID NO:18) has good identity with several tetraspanin proteins.



Table 7F. BLAST results for NOV7

Gene Index/Identifier                             Protein/Organism                               Length (aa)   Identity (%)     Positives (%)    Expect

gi|6601561|gb|AAF19031.1|AF206661_1 (AF206661)    neuronal tetraspanin, Gallus gallus            247           128/235 (54%)    143/235 (60%)    4e-59
gi|6685175|gb|AAF23828.1|AF220044_1 (AF220044)    tetraspanin, Drosophila melanogaster           267           42/185 (22%)     71/185 (37%)     6e-07
gi|13097420|gb|AAH03448.1|AAH03448 (BC003448)     Similar to tetraspan 1, Mus musculus           240           56/211 (26%)     77/211 (35%)     6e-06
gi|10834972|ref|NP_000551.1|                      LEUKOCYTE SURFACE ANTIGEN CD53, Homo sapiens   219           165/304 (54%)    206/304 (67%)    6e-05



This information is presented graphically in the multiple sequence alignment given in
Table 7G (with NOV7 being shown on line 1) as a ClustalW analysis comparing NOV7 with
related protein sequences.

Table 7G. Information for the ClustalW proteins:

1) NOV7 (SEQ ID NO:18)

2) gi|6601561|gb|AAF19031.1|AF206661_1 (AF206661) neuronal tetraspanin [Gallus gallus] (SEQ ID NO:72)

3) gi|6685175|gb|AAF23828.1|AF220044_1 (AF220044) tetraspanin [Drosophila melanogaster] (SEQ ID NO:73)

4) gi|13097420|gb|AAH03448.1|AAH03448 (BC003448) Similar to tetraspan 1 [Mus musculus] (SEQ ID NO:74)

5) gi|10834972|ref|NP_000551.1| CD53 antigen [Homo sapiens] (SEQ ID NO:75)



[ClustalW multiple sequence alignment of NOV7 with gi|6601561, gi|6685175, gi|13097420 and gi|10834972; the aligned residues are not recoverable from the source text.]



The presence of identifiable domains in NOV7 was determined by searches using
algorithms such as PROSITE, Blocks, Pfam, ProDomain, Prints and then determining the
Interpro number by crossing the domain match (or numbers) using the Interpro website
(http://www.ebi.ac.uk/interpro/).

DOMAIN results for NOV7 were collected from the Conserved Domain Database
(CDD) with Reverse Position Specific BLAST. This BLAST samples domains found in the
Smart and Pfam collections. The results are listed in Table 7H with the statistics and domain
description. The results indicate that this protein contains the transmembrane 4 domain at the
positions indicated in Table 7H. Residues 10-180 of NOV7 are aligned with residues 1-153 of
the Transmembrane family (SEQ ID NO:76) (E = 1e-13). This indicates that the sequence of
NOV7 has properties similar to those of other proteins known to contain this domain and
similar to the properties of this domain.
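
The aligned span quoted above can be related to the domain length reported in the DOMAIN results (226 residues, 67.7% aligned). The sketch below is illustrative only; the function name is hypothetical, and it assumes the residue ranges are 1-based and inclusive.

```python
def aligned_span(first, last):
    """Number of residues in a 1-based inclusive range."""
    if last < first:
        raise ValueError("range end precedes range start")
    return last - first + 1


domain_length = 226                      # CD-Length reported for pfam00335
domain_aligned = aligned_span(1, 153)    # domain residues 1-153
query_aligned = aligned_span(10, 180)    # NOV7 residues 10-180

coverage_pct = 100.0 * domain_aligned / domain_length
print(f"domain residues aligned: {domain_aligned} of {domain_length} "
      f"({coverage_pct:.1f}%)")
print(f"NOV7 residues spanned:   {query_aligned}")
```

Under that assumption the computed domain coverage reproduces the 67.7% figure reported with the DOMAIN results.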
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Table 7F. DOMAIN results for NOV7 

gnl|Pfam|pfam00335, transmembrane4, Transmembrane 4 family
CD-Length = 226 residues, only 67.7% aligned

Score = 70.1 bits (170), Expect = 1e-13

[Alignment of NOV7 with pfam00335 (Transmembrane 4 family); the aligned residues are not recoverable from the source text.]

The tetraspanin superfamily includes membrane proteins, such as leukocyte surface
antigen CD37 (OMIM 151523), CD9 (OMIM 143030), CD53 (OMIM 151525), CD81 (OMIM
186845), and the R2 antigen (KAI1; OMIM 600623), among others. See also, OMIM 300096
and 300191, describing members of the transmembrane 4 superfamily, which includes
tetraspanin. Many of these molecules are expressed on leukocytes and have been implicated
in signal transduction, cell-cell interactions, and cellular activation and development.
CD81 antigen (or TAPA1) is a 26-kD integral membrane protein expressed on many
human cell types. Antibodies against TAPA1 induce homotypic aggregation of cells and can
inhibit their growth. Oren et al. isolated a cDNA coding for TAPA1. The highly hydrophobic
TAPA1 protein contains four putative transmembrane domains and a potential
N-myristoylation site. Oren, et al., Molec. Cell. Biol. 10:4007-4015, 1990. TAPA1 showed
strong homology with the CD37 leukocyte antigen (OMIM 151523) and with the ME491
melanoma-associated antigen (OMIM 155740), both of which have been implicated in the
regulation of cell growth. Andria et al. cloned the murine homolog of TAPA1 from both
cDNA and genomic DNA libraries and demonstrated a very high level of homology between
human and mouse genes. Andria et al., J. Immun. 147:1030-1036, 1991. See, for example,
OMIM 186845.

CD81 is a member of the transmembrane pore integral membrane protein family. It
has broad tissue distribution, but its function had not been identified. Boismenu et al. obtained
a complete gene from mouse CD81 by RT-PCR. Boismenu et al., Science 271:198-200, 1996.

A monoclonal antibody specific for mouse CD81 blocked the appearance of alpha-beta
T cells but not gamma-delta T cells in fetal organ cultures initiated with day 14.5 thymus
lobes. In re-aggregation cultures with CD81-transfected fibroblasts, CD4-/CD8- thymocytes
differentiated into CD4+/CD8+ T cells. The authors therefore concluded that interactions
between immature thymocytes and stromal cells expressing CD81 are required and may be
sufficient to induce early events associated with T-cell development.

Chronic hepatitis C virus (HCV) infection occurs in about 3% of the world's
population and is a major cause of liver disease. HCV infection is also associated with
cryoglobulinemia, a B lymphocyte proliferative disorder. Virus tropism and the mechanisms of
cell entry are not completely understood. Pileri et al. demonstrated that the HCV envelope
protein E2 binds human CD81, a tetraspanin expressed on various cell types including
hepatocytes and B lymphocytes. Pileri, et al., Science 282:938-941, 1998. Binding of E2 was
mapped to the major extracellular loop of CD81. Recombinant molecules containing this loop
bound HCV and antibodies that neutralize HCV infection in vivo inhibited virus binding to
CD81 in vitro.

Through eukaryotic expression cloning with an antimetastatic monoclonal antibody,
Testa et al. have recently identified a tetraspanin member, PETA-3/CD151, as an effector of
human tumor cell migration and metastasis. Testa, et al., Cancer Res 59:3812-3820, 1999.

NOV7 has been analyzed for tissue expression profiles. See Examples.

As shown in Table 7H, below, this 96 well plate for Panel 1.1, and its variants, are
composed of RNA/cDNA isolated from various human cell lines that have been established
from normal and malignant human tissues. Panel 4 contains cells and cell lines from normal
cells and cells related to inflammatory conditions.

The TaqMan oligo set Ag610 for the NOV7 gene includes the forward, probe and
reverse oligomers. Sequences for the oligos are shown in Table 7G.



Table 6G: Taqman primers

Position   Primers   Sequences                                  SEQ ID NO       Length

373        Forward   5'- GCACTACCAGGGCAATAACGA -3'              SEQ ID NO: 77   21
399        Probe     FAM-5'- ACGTCTTCTCTGCCACCTGGAACTCG -       SEQ ID NO: 78   26
427        Reverse   5'- GCAGCAACCAAATGTGATCATG -3'             SEQ ID NO: 79   22
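
The oligo lengths and positions in the primer table can be cross-checked with a few lines of code. This is a sketch under stated assumptions: the helper name is hypothetical, the listed positions are taken to be the 1-based start of each oligo's footprint on the sense strand, and the reverse oligo is assumed to anneal to the sense strand so that its reverse complement is what appears in the coding sequence.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of an uppercase DNA sequence."""
    return seq.translate(COMPLEMENT)[::-1]


forward = "GCACTACCAGGGCAATAACGA"    # SEQ ID NO: 77, position 373
reverse = "GCAGCAACCAAATGTGATCATG"   # SEQ ID NO: 79, position 427

forward_start = 373
reverse_start = 427  # assumed: start of the reverse oligo's sense-strand footprint

# Sense-strand footprint of the reverse oligo.
reverse_footprint = reverse_complement(reverse)

# Amplicon runs from the first base of the forward oligo to the last base
# of the reverse oligo's footprint.
amplicon_length = reverse_start + len(reverse) - forward_start

print("forward length   :", len(forward))
print("reverse length   :", len(reverse))
print("reverse footprint:", reverse_footprint)
print("amplicon length  :", amplicon_length)
```

If the position assumption holds, the forward oligo ends at nucleotide 393 and the probe starts at 399, so the three oligos do not overlap.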



Taqman results are shown below in Table 7H.

Panel 1.1 Tissue Name    %Rel. Expn.    Panel 4D Tissue Name                             %Rel. Expn.

Adipose                  1.8            93768_Secondary Th1_anti-CD28/anti-CD3           0.5
Adrenal gland            30.6           93769_Secondary Th2_anti-CD28/anti-CD3           0.5
Bladder                  5.5            93770_Secondary Tr1_anti-CD28/anti-CD3           0.4
Brain (amygdala)         1.7            93573_Secondary Th1_resting day 4-6 in IL-2      8.3
Brain (cerebellum)       85.3           93572_Secondary Th2_resting day 4-6 in IL-2      6.8
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RraiTi rhmnnrsiTrtt^ii*!^ 

r*' 1* 1 ' ' llJLLUlJLFuilXUUUu 1 


8 3 


93^71 Secondarv Tr 1 restinn dav 4-6 in IL-2 


9*1 


Dlalu ^SUUSlouUa- XUj^ra^ 




Q^^fiR TiriTnaTvThl «titi-rD78/anti-CD3 


0*3 


Btdin (thalamus) 


D./ 


M«|£0 TiM-rwai-i/ Tli7 aTiti.J'^TlOft/QTifi f^T^^ 

yjjpy pnmaiy in^^annH,jjzo/ann"i.^i./^ 


a 6 


Ccrebtal Cortex 


2.6 


93570^rmiary iri_ana-uiJ2lyann-vJlJi 


y.-> 


Brain (fetal) 


23.8 


93565_primary *Ilil_re3ting dy 4-6 in 11^2 


52.7 


Bxain (whole) 


6.9 


93566_pnmary Th2j:estmg dy 4-6 m IL-2 


15.7 


CNSca. (glio/astro)U-118- 








MKJ 


n fi 
U.U 


? J jo /__pnmary iri^resnng ay *i-o in ul*-* 




CNS ca. (astro) SF-539 


0*8 


93351_CD45RA CD4 lyiD|)hocyte_anti-CD28/anti-CD3 


O.o 


CNS ca. (astro) SNB-75 


1.2 


93352_CD45RO CD4 lymphocyte_anti-CD28/anti-CD3 


1.6 


CNS ca. (astro)SW1783 


2J 


9325 1_CD8 Lymphocytes_anti-CD28/anti-CD3 

93353 chronic CDS Lymphocytes 2ry_resting dy 4-6 in IL- 


0.2 


CNS ca. (glio)U251 


0.0 


2 


2.0 




93574 chronic CD 8 Lymphocytes 2ry_activated 




CNS ca. (glio)SF-295 


9.0 


CD3/CD28 


0.4 


CNS ca. (glio)SNB-19 


0.1 


93354_CD4_none 


9.4 


CNS ca. (glio/astro) U87- 








MG 


0*0 


93252_Sccondary Tlil/Th2/Trl_anti-CD95 CHll 


13*7 


CNS ca.* (neuro; met ) SK- 








N-AS 


49.7 


33 1 03_LAK ceLi3_restmg 


z.u 


Mammary gland 


9.7 


93788_LAK cells_II^2 i 


0.7 


Breast ca. BT-549 


0.0 


93787_LAK cells_IL-24-IL-12 


L4 


Breast ca. MDA-N 


0.0 


93789_LAK cells_lL-2+IFN gamma 


2.5 


Breast ca,* (pi e^sion) 








r47D 


0.0 


93790_LAK ccils_IL-2+ IL-IS 


1.4 


Breast ca,* (pL effusion) 








MCF-7 


0.0 


93 1 04_LAK cellsPMA/ionomycm and IL- 1 S 


0.5 


Bieast ca* (pLei) MDA- 












□QC70 'hXtr r'jfclle TT 9 -me-Hna 

7jj / 0 IN^ v^cns uu-^^resong 




Small intcstmc 


1 /♦O 


!:r J 1 Up jocLixea jbympnocytc i\*acuoii_i wo w ay lyii^t^ 


1 9 


uoiorecuu 




7ji iu ivLixea ijynipnocyxe ixjcoouon^iwo way iVLi^iN. 


ft ft 


i^uion ca. xiiZ)f 


A n 

U.v 


y J 111 iVLiAco L/ynipxiocyie luJacuon^^ i wo w ny xvjujXv 


0.2 


colon ca^\Jauo-z 


3.*l 


7jii^ ivionQnuciciir oiviv^ / rcsinig 




Colon ca-HCT-15 


0.0 


9d 1 13_Mononuclear \Jells (rxiMus)Jr WM 




Colon caJiCT-116 


4.7 


93 1 14_Mononuclcar Cells (PBMCs)_PHA-L 


3.6 


Colon ca.HCC-299S 


0*0 


93249„Ramos (B ccll)_noiie 


0.0 


Colon ca. SW480 


0,0 


93250_Ramo3 (B cell)Jonomycin 


0.0 


Colon ca,*(SW480 








met)SW620 


0.0 


93349_B lynq)nocytes_PWM 


2.1 


Stomach 


9*9 


93350_B lymphoytes_CD40L andE^ 


0.7 


Gagtnc ca,* (liver met) NCI- 








INo / 


ft ft 




0.0 




100.0 


9324s EOL-1 rRosinonh-il^ dhiiAMP/PMAionomvcin 


0.0 


FCUtX OA.CiCUll 


27.4 


93356 Dendritic Cells fiotie 


0.0 


i3&CiCuU IllUaljlC 


16.6 


93355 Dendritic Cells LPSlOOnB/ml 


0.1 


cnuouicuai ccjus 


S4 7 




0*0 


cmuoxncnai ecus ^ucavca^ 


J. 1 




0.3 


fCidney 


4'^ 8 
*tJ*o 


93776 MrmncvteQ T PS Sft TKr/ml 


0.0 




12 3 




0.4 


Donal i^o n^i^Ci 

tsJSuSii Ca, AOO*VJ 


ft ft 


Q'^SRO MnrrAnYiflCTpg TPCllftftna/ml 


0.0 




ft 1 


O'^OOR HTTVPr' rFnHnthftlinl'^ nnne 

7iJV/7p X±.U Y xj\^ 1 J^ilUUlXlClUllf llUJLlw 


25.0 


tvOnal Ca. /Vl^oTliN 




vj \/77_n u V J3\^ ^x^utiiJ uicLuu UU V vV» 


70.2 


Renal ca TK- 1 0 


12.0 


93100 HUVEC (Endothelial) IL-lb 


24.4 


Elcnalca.UO-31 


8.0 


93779_HUVEC {Endothelial)_IFN gamma 


36.6 


Renal ca. RXF 393 


5.2 


93102_HUVEC (Endothelial)_TNF alpha + IFN gamma 


6.6 


Liver 


8.5 


93101_HUVEC (Endothelial)_TNF alpha + IL4 


4.8 


Liver (fetal) 


3.7 


93781_HUVEC (Endothelial)_IL-11 


32.6 


Liver ca. (hepatoblast) 




o5o3_Lung Microvascular bnaotneuai v^ensr^nonc 


89.1 


HepG2 


0.0 5 




)3584_Lung Microvascular Enaotnelial ceiis_iJNra 




Lung 


9,2 I 


ig/mlj ana iLlb (^i ng/nu; 


38.0 


Lung (fetal) 


13.0 ! 


92662_Microvascular Dermal endothelium_none 


1 AA A 


Lung ca. (non-s.cell) HOP- 




?2663 Microsvasular Dermal endothclium^TNFa (4 ng/ml) 


45.0 


62 


15.3 i 


mdILlb(lng/ml) 


Lung ca. (large cell)NCI- 




93773_Bronchial epithelium_TNFa (4 ng/ml) and IL1b (1 


A A 


BI460 


0.1 1 


ag/ml) ** 


Lung ca» (non-s.cell) NCI- 






0.1 


H23 


1.3 


93347_Small Airway Epithelium_none 


Lung ca. (non-s.cl) NCI- 




?334S_Small Airway Epithelium_TNFa (4 ng/ml) and ILlb 


A 1 
0.1 


H522 


4.6 


[1 ng/ml) 


Lung ca. (non-sin. cell) 






^ 0.2 


A.549 


0.3 


92668_Coronary Artery SMC_resting 


Lung ca. (sxcll var,) SHP- 




32669 Coronery Artery SMCJ^rNra (4 ngoni; ana ibin\i 


0.0 


77 


U*U 


4g/mij 


Lung ca. (small cell) LX-1 


0.0 


93107_astrocytes_resting 


1L7 


Lung ca. (small cell) NO- 




7.?iUo astrocyres_irNra ng/nu^ aou xl^iu iig/xiu^ 


9.0 


H69 


A yl 
0.4 


Lung ca. (squam.)SW 900 


A 1 


y iooo__^ u^5 i z ^^rjasopnu j_rraimg 


t Q 

A 


i^UlLg Co. ^SlJUalU.^ i^VJL" 




92667_KU-812 (Basophil)_PMA/ionomycin 


4.6 


H596 


0.6 


LjyJJLlL'll UvM-v 


4,6 


93579_CCD1106 (Keratinocytes)_none 


0.0 




3.3 


93580_CCD1106 (Keratinocytes)_TNFa and IFNg ** 


0,0 




1.0 


93791 Liver Cirrhosis 


L4 




12.1 


93792 Lupus Kidney 


3.0 




1.6 


93577_NCI-H292 


1.1 


Uvanan ca, UVCAK- j 




O'^'l^ft MPr fl007 TT -4 


1.9 


Ovarian ca. OVCAR-4 


0.5 


933o0_NCI-H292_lL-y 


1 7 
i. / 


Ovarian ca. OVCAR-5 


2,5 




1 1 
1*1 


Ovarian ca.OVCAR-8 


0,0 


93357_NCI-H292_IFN gamma- 


1.4 


Ovarian ca.* (ascites) SK- 






21.4 


OVo 


B s 
o#o 




Pancreas 


0.3 


W / /o_rlrAiiU_^Uv-l Dew llNA aipua 


8.9 


Pancreatic ca. CAPAN 2 




wzp^^JNormai Jtiuman i^ung rmrooiasvjaonc 

93253 Normal riuman JLuag riuroDiasi^iJNra \** ng^niij 


0*0 


Pituitary gland 


0.5 


ana IL-Io {} ng/mi; 


0.0 


Placenta 


ICQ 


Q^OC? "hJj LP mill XXi 1 Tvt n T imn ^'iWm^laa'f' TT A 

yj z D / JN ormai jtiuman L>ung jr ? pro diss t_iir-H 


0.1 


Prostate 


4.8 


y325o Normal Human i^ung r iDroDiast_iL-y 


n 1 


Prostate ca.' (^oone metifL-- 










00 


93255_Normal Human Lung Fibroblast_IL-13 


0.2 


Salivary gland 


4.1 


93258_Normal Human Lung Fibroblast_IFN gamma 


0.0 




2.9 


93106_Dermal Fibroblasts CCD1070_resting 


2.1 


Spinal cord 


7.2 


93361_Dermal Fibroblasts CCD1070_TNF alpha 4 ng/ml 


10.5 




4.1 


93105_Dermal Fibroblasts CCD1070_IL-1 beta 1 ng/ml 


0.7 


1 11 y 1 viu 


10.1 


93772_dermal fibroblast_IFN gamma 


0.1 




11.1 


93771_dermal fibroblast_IL-4 


0.2 


Melanoma M14 


ft n 




8.1 


Melanoma lkj/l imvi 


A A 


7JZOU LDli' Vl>OJUU5 Z 


1.5 


Melanoma ual^L'-dz 


A A 


7JZQi xDxJ I^LUUlia 


2.5 


Melanoma SK-MEL-28 


0.0 


735010_Colon_normal 


26.9 


Melanoma* (met) SK-MEL- 






62.1 


5 


2.0 


735019_Lung_none 


Melanoma Hs688(A).T 


10.1 


64028-1_Thymus_none 


41.9 


Melanoma* (met) 






3.4 


Hs688(B)T 


3.7 


64030-1 Kidney none 
In Table 6J the following abbreviations are used: ca. = carcinoma, * = established from metastasis, met = metastasis, s cell 
var = small cell variant, non-s = non-sm = non-small, squam = squamous, pl. eff = pl effusion = pleural effusion, glio = 
glioma, astro = astrocytoma, and neuro = neuroblastoma. 

The data from panel 1.1 indicate that expression of Ag610 is primarily in normal 
tissues including the kidney, endothelial cells, heart, brain, skeletal muscle, and the adrenal 
gland. The only tumor which highly expresses Ag610 is mel SK_N_AS. 

The data from panel 4D indicate that the Ag610 transcript is highly expressed in 
resting primary and secondary T cells, but expression is almost absent in activated cells. This 
is particularly striking in primary Th1 cells, where there is a greater than 50-fold difference in 
transcript levels between primary activated Th1 cells and primary resting Th1 cells. The only 
activated T cell populations that express this antigen are Th1/Tr1/Th2 cells activated in the 
presence of anti-CD95, an antibody which blocks FasL-mediated apoptosis. Normal colon 
also highly expresses this transcript, but expression is reduced 3- to 10-fold in 
colon tissue from patients with IBD or Crohn's disease. Untreated HUVEC and lung 
microvascular endothelial cells also highly express this transcript, which is down-regulated after 
activation in these tissues. The expression of this molecule suggests that it is down-regulated 
in response to inflammation. 
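
The fold differences quoted above are ratios of relative expression values. A minimal Python sketch of that arithmetic follows; the function name and the two input values are hypothetical illustrations, not data taken from the panels.

```python
def fold_change(resting: float, activated: float) -> float:
    """Return how many times higher the resting signal is than the activated one."""
    if activated <= 0:
        raise ValueError("activated expression must be positive to form a ratio")
    return resting / activated


# Hypothetical relative-expression values (percent of the highest sample on a panel).
resting_value = 52.0
activated_value = 1.0

ratio = fold_change(resting_value, activated_value)
print(f"resting/activated ratio: {ratio:.0f}-fold")
```

A ratio above 50 in such a comparison would correspond to the "greater than 50-fold" difference described for primary Th1 cells.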

In some embodiments, a protein therapeutic derived from NOV7 prevents the 
activation of Th1, Th2, and Tr1 cells, thereby reducing or inhibiting inflammation in chronic 
autoimmune diseases mediated by activated T cells such as asthma, arthritis, psoriasis, and 
inflammatory bowel disease. The applicability of this molecule in inflammatory bowel 
disease (IBD) is further suggested by the absence of this transcript in tissue from patients with 
Crohn's disease and colitis. VanCompernolle et al., Eur J Immunol 31:823-31, 2001; 
Kitadokoro et al., EMBO J 20:12-8, 2001. 

The similarity information for the NOV7 protein and nucleic acid disclosed herein 
suggests that NOV7 may have important structural and/or physiological functions characteristic 
of the 4 transmembrane family. Therefore, the nucleic acids and proteins of the invention are 
useful in potential diagnostic and therapeutic applications and as a research tool. These 
include serving as a specific or selective nucleic acid or protein diagnostic and/or prognostic 
marker, wherein the presence or amount of the nucleic acid or the protein are to be assessed, as 
well as potential therapeutic applications such as the following: (i) a protein therapeutic, (ii) a 
small molecule drug target, (iii) an antibody target (therapeutic, diagnostic, drug 
targeting/cytotoxic antibody), (iv) a nucleic acid useful in gene therapy (gene delivery/gene 
ablation), (v) a composition promoting tissue regeneration in vitro and in vivo, and (vi) a 
biological defense weapon. The novel nucleic acid encoding NOV7, and the disclosed NOV7 
protein, or fragments thereof, may further be useful in diagnostic applications, wherein the 
presence or amount of the nucleic acid or the protein are to be assessed. 

The nucleic acids and proteins of the invention are useful in potential therapeutic 
applications implicated in HCV infection, Burkitt lymphoma, metastatic tumors, 
immunological disorders, particularly those involving T-cells, and/or other pathologies and 
disorders. For example, a cDNA encoding the tetraspanin-like protein may be useful in gene 
therapy, and the tetraspanin-like protein may be useful when administered to a subject in need 
thereof. By way of nonlimiting example, the NOV7 compositions will have efficacy for 
treatment of patients suffering from HCV infection, Burkitt lymphoma, metastatic tumors and 
immunological disorders, particularly those involving T-cells. The novel nucleic acid 
encoding tetraspanin-like protein, and the tetraspanin-like protein of the invention, or 
fragments thereof, may further be useful in diagnostic applications, wherein the presence or 
amount of the nucleic acid or the protein are to be assessed. 

The disclosed NOV7 polypeptides can be used as immunogens to produce vaccines. 
The novel nucleic acid encoding NOV-like protein, and the NOV-like protein of the invention, 
or fragments thereof, may further be useful in diagnostic applications, wherein the presence or 
amount of the nucleic acid or the protein are to be assessed. These materials are further useful 
in the generation of antibodies that bind immunospecifically to the novel substances of the 
invention for use in therapeutic or diagnostic methods. For example, the disclosed NOV7 
protein has multiple hydrophilic regions, each of which can be used as an immunogen. In one 
embodiment, a contemplated NOV7 epitope is from about amino acids 110 to 140. In another 
embodiment, a NOV7 epitope is from about amino acids 155 to 180. In additional 
embodiments, NOV7 epitopes are from amino acids 190 to 200. These novel proteins can also 
be used to develop assay systems for functional analysis. These antibodies may be generated 
according to methods known in the art, using prediction from hydrophobicity charts, as 
described in the "Anti-NOVX Antibodies" section below. 
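
Prediction of hydrophilic regions from a hydrophobicity chart can be sketched as a sliding-window average over a per-residue scale. The Python example below assumes the Kyte-Doolittle scale, a 9-residue window and a cutoff of -1.0; the text does not state which scale, window or cutoff was used, and the function names and toy peptide are hypothetical.

```python
# Kyte-Doolittle hydropathy values (assumed scale; positive = hydrophobic).
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def hydropathy_profile(seq: str, window: int = 9) -> list[float]:
    """Mean hydropathy of every full window along the sequence."""
    scores = [KYTE_DOOLITTLE[aa] for aa in seq.upper()]  # KeyError on non-standard residues
    return [sum(scores[i:i + window]) / window
            for i in range(len(scores) - window + 1)]


def hydrophilic_windows(seq: str, window: int = 9, threshold: float = -1.0) -> list[int]:
    """1-based start positions of windows whose mean hydropathy is below the threshold."""
    return [i + 1 for i, mean in enumerate(hydropathy_profile(seq, window))
            if mean < threshold]


# Toy peptide: a hydrophobic stretch followed by a charged, hydrophilic stretch.
toy = "L" * 9 + "K" * 9
print(hydrophilic_windows(toy))
```

Runs of consecutive window starts returned by `hydrophilic_windows` mark candidate immunogenic stretches; applied to a full-length sequence, they would be compared with epitope ranges such as those listed above.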

NOV8 

NOV8a 

NOV8a was initially identified by searching CuraGen's Human SeqCalling database 
for DNA sequences which translate into proteins with similarity to a protein family of interest. 
SeqCalling assembly 27479850_EXT1 was identified as having suitable similarity. 
SeqCalling assembly 27479850_EXT1 has one component. In a search of CuraGen's human 
expressed sequence assembly database, assembly s3aq:27479850 (507 nucleotides) was 
identified as having identical homology to this predicted gene sequence. This sequence is 
derived from a publicly available Homo sapiens expressed sequence tag (EST) incorporated 
into the CuraGen database. This database is composed of the expressed sequences (as derived 
from isolated mRNA) from more than 96 different tissues. The mRNA is converted to cDNA 
and then sequenced. These expressed DNA sequences are then pooled in a database and those 
exhibiting a defined level of homology are combined into a single assembly with a common 
consensus sequence. The consensus sequence is representative of all member components. 
Since the nucleic acid of the described invention has identical sequence identity with the 
CuraGen assembly, the nucleic acid of the invention represents an expressed gene sequence. 

SeqCalling assembly 27479850_EXT1 was analyzed further to identify open reading 
frame(s) encoding a novel full-length protein(s) and novel splice forms of these SHDs. 
This was done by extending the SeqCalling assembly using suitable additional SeqCalling 
assemblies, publicly available EST sequences, as well as public genomic sequence. Public ESTs 
and additional CuraGen SeqCalling assemblies were identified by the CuraTools program 
SeqExtend™. They were included in the DNA sequence extension for SeqCalling assembly 
27479850_EXT1 only when sufficient identical overlap was found. These inclusions are 
described below: 

Genomic clone AC008616 was identified as having regions with 100% identity to the 
SeqCalling assembly 27479850_EXT1 and was selected for analysis because this identity 
implied that this clone contained the sequence of the genomic locus for SeqCalling assembly 
27479850_EXT1. 

The genomic clone was analyzed by Genscan and Grail to identify exons and putative 
coding sequences/open reading frames. This clone was also analyzed by TblastN, BlastX, and 
other homology programs to identify regions translating to proteins with similarity to the 
original protein/protein family of interest. 

The results of these analyses were integrated and manually corrected for apparent 
inconsistencies, thereby obtaining the sequences encoding the full-length protein. When 
necessary, the process to identify and analyze cDNAs/ESTs and genomic clones was reiterated 
to derive the full-length sequence. This invention describes this full-length DNA sequence and 
the full-length protein sequence which it encodes. These nucleic acid and protein sequences 
for each splice form are referred to here as 27479850_EXT1, or NOV8a. 

Specifically, CuraGen's SeqCalling Assembly 27479850_EXT1 is made up of one 507 bp 
fragment. SeqCalling Assembly 27479850_EXT1 lists lung, testis and B-cell as tissue 
sources. Literature sources mentioned above cite brain and the central nervous system as 
tissue sources for SHD and SHD-like proteins. SeqCalling assembly 24111358_EXT1 
showed initial homology, by searching with BLASTX, to a M. musculus (mouse) protein: SHD 
PROTEIN (SPTREMBL-ACC:O88834; 343 aa). Using BlastN, this SeqCalling Assembly 
was identical at the nucleotide level to a GenBank genomic sequence: Homo sapiens 
chromosome 19 clone CIT978SKB_144D21, 49 unordered pieces, 112626 base pairs 
(bp) (GENBANKNEW-ID:AC008616|acc:AC008616). AC008616 was processed with 
GenScan™ and the predicted coding regions were analyzed using BlastX, BlastN and TBlastN 
to find exons with homologies to M. musculus SHD PROTEIN. The genomic clone matched 
identically to the SeqCalling Assembly 27479850_EXT1. AC008616 was used to extend 
27479850_EXT1. This was accomplished by using the protein sequence of O88834 and 
CuraTools' TblastN against the GBNEW database. Intron/exon junctions were determined by 
manual inspection and corrected for apparent inconsistencies. BlastX of this sequence showed 
the correct full-length protein, 27479850_EXT1. The base pair (bp) regions used from the 
genomic clone were: 67447-67770, 70280-70357, 70436-70624, 72160-72288, 75627-75746, and 
77831-78016. The disclosed NOV8 is expressed in at least the following tissues: brain and 
central nervous system (derived from literature sources) and lung, testis and B-cell (derived from 
27479850_EXT1). 
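
As a consistency check, the six genomic regions listed above can be summed and compared with the 1026-nucleotide length reported for the NOV8a nucleic acid. The short Python sketch below uses the coordinates exactly as given in the text and treats them as inclusive 1-based ranges, which is an assumption; the variable and function names are illustrative.

```python
# Genomic regions of clone AC008616 used to build 27479850_EXT1 (inclusive coordinates).
EXON_REGIONS = [
    (67447, 67770),
    (70280, 70357),
    (70436, 70624),
    (72160, 72288),
    (75627, 75746),
    (77831, 78016),
]


def region_length(start: int, end: int) -> int:
    """Length of an inclusive coordinate range."""
    return end - start + 1


lengths = [region_length(start, end) for start, end in EXON_REGIONS]
total = sum(lengths)
print(lengths)  # [324, 78, 189, 129, 120, 186]
print(total)    # 1026
```

Under that assumption the six regions add up to 1026 bp, matching the stated length of the assembled coding sequence.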

The novel nucleic acid was identified on chromosome 19. The disclosed NOV8a 
nucleic acid of 1026 nucleotides (also referred to as 27479850_EXT1) is shown in Table 8A. 
An open reading frame begins with an ATG initiation codon at nucleotides 1-3 and ends with a TGA 
codon at nucleotides 1024-1026. 
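
The open reading frame coordinates determine the length of the encoded protein: the number of codons, minus one for the stop codon. A minimal Python sketch of that calculation follows; the function name is illustrative, and the coordinates are the 1-based positions quoted in the text.

```python
def encoded_residues(first_nt_of_start: int, last_nt_of_stop: int) -> int:
    """Number of amino acids encoded by an ORF given inclusive 1-based coordinates.

    The stop codon is counted in the ORF span but does not encode a residue.
    """
    orf_length = last_nt_of_stop - first_nt_of_start + 1
    if orf_length % 3 != 0:
        raise ValueError("ORF length is not a multiple of three")
    return orf_length // 3 - 1


# NOV8a: ATG at nucleotides 1-3, TGA at nucleotides 1024-1026.
print(encoded_residues(1, 1026))  # 341
```

The result agrees with the 341 amino acid residues reported for the NOV8a protein.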



Table 8A. NOV8a Nucleotide Sequence (SEQ ID NO:19) 

ACTACACCGAGAGCGACATCCTGAGGGCCTACCGCGCGCAGAAGAACCTGGACTTCGAGG^^ 

GGAGGCGGAGAGCCGCTTGGAGCCGGACCCCGCGGGCCCTGGGGACTCCAAGAACCCCGGAGATGCCAAG 

TATGGTTCTCCCAAGCACCGGCTCATCAAGGTGGAGGC*rGCGGA'rATGGCCAGAGCCAAGGCCCTTCTGG 

GCGGCCCCGGGGAGGAGGTGCGTGGCTGGGTGGCCTGGGGAGAGCGCTTTGATGCTCAGCCTCATCCTGC 

ACCCCCGGATGATGGGTACATGGAGCCCTACGATGCCGAATGGGTCATGAGTGAACTTCCCGGCAGAGGG 

GTGCAGCTCTATGACACCCCTTATGAGGAACAGGACCCAGAGACAGCAGATGGACCCCCTTCTGGGCAGA 

AGCCTCGGCAGAGCCGGATGCCCCAGGAAGATGAACGGCCAGCAGATGAGTATGATCAGCCCTGGGAGTG 

GAAGAAAGACCACATCTCCAGGGCGTTTGCACCAGTGCAGTTTGACAGTCCAGAGTGGGAGAG6ACTCCA 

GGCTGAGCCAAGGAGCTCCGGAGACCTCCGCCCAGAAGCGCCCAGCCTGCGGAGCGTGTGGACCCAGCCG 

TGCCCCTGGAGAAACAGCCGTGGTTTCATGGCGCCCTGAACAGGGCGGATGCAGAGiAGCCTCCTGTCCCT 

CTGCAAGGAAGGGAGCTACCTAGTGCGGCTCAGTGAGACCAACCCCCAGGACTGCTCCT 

AGGAGCCAGGGCTTCCTGCATCTGAAGTTCGCGCGGACCCGTGAGAACCAGGTGGTGCTGGGCCAACACA 

GGGGGCCCTTCCCGAGCGTGCCCGAGCTCGTCGTCCACTACAGTTCACGCCCACTGCCGGTGCAGGGTGC 

CGAGCATCTGGCTCTGCTGTACCCCGTGGTGACGCAGACCCCCTGA 



The disclosed nucleic acid sequence has 299 of 360 bases (83%) identical to a 1529 bp 
Mus musculus src homology domain (SHD) mRNA (GENBANK-ID:AB018423) (E value = 

The NOV8a protein encoded by SEQ ID NO:19 has 341 amino acid residues, and is 
presented using the one-letter code in Table 8B (SEQ ID NO:20). The SignalP, Psort and/or 
Hydropathy profile for NOV8 predict that NOV8 has a signal peptide and is likely to be 
localized in the cytoplasm with a certainty of 0.5050. 






Table 8B. Encoded NOV8a protein sequence (SEQ ID NO:20). 


MAKVJLRDYLSFGGRRPPPQPPTPDYTESDILRAYRAQKNLDFEDPYEDAESRLSPDPAGPGDSKNPGDA 
KYGSPKHRLIKVEAADMARAKALLGGPGEEVRGWVAWGDPFDAQPHPAPPDDGYMEPYDAQWVMSELPG 
RGVQLYDTPYEEQDPETADGPPSGQKPRQSRMPQEDERPADEYDQPWEWKKDHISRAFAPVQFDSPWE 
RTPGSAKELRRPPPRSPQPAERVDPALPLEKQPWFHGPLNRADAESLLSLCKEGSYLVRLSBTNPQcbs 
LSLRSSQGFLHLKFARTRENQWLGQHSGPFPSVPELVLHYSSRPLPVQGAEHIiALLYPWTQTP 





The full amino acid sequence of the protein of the invention was found to have 257 of 
338 amino acid residues (76%) identical to, and 275 of 338 residues (81%) positive with, the 
343 amino acid residue SHD protein from Mus musculus (ptnr:SPTREMBL-ACC:O88834) 
(E value = 5.5e^^^^ 
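
The identity and positive percentages quoted for these alignments are consistent with dividing matches by aligned positions and dropping the fractional part. The Python sketch below reproduces the NOV8a figures under that truncation assumption; the function name is illustrative and the rounding rule is inferred from the numbers, not stated in the text.

```python
def percent_truncated(matches: int, aligned: int) -> int:
    """Whole-number percentage of matching positions, truncated rather than rounded."""
    if aligned <= 0:
        raise ValueError("aligned length must be positive")
    return matches * 100 // aligned


# Figures reported for NOV8a against the mouse SHD sequences.
print(percent_truncated(299, 360))  # nucleotide identity
print(percent_truncated(257, 338))  # amino acid identity
print(percent_truncated(275, 338))  # amino acid positives
```

These three calls give 83, 76 and 81, the percentages stated in the text.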

NOV8b 

The sequence of Acc. No. CG51761-02 (NOV8b) was derived by laboratory cloning of 
cDNA fragments, by in silico prediction of the sequence, and by refining the information 
obtained for NOV8a. cDNA fragments covering either the full length of the DNA sequence, 
or part of the sequence, or both, were cloned. In silico prediction was based on sequences 
available in CuraGen's proprietary sequence databases or in the public human sequence 
databases, and provided either the full-length DNA sequence, or some portion thereof. The 
laboratory cloning was performed using one or more of the methods summarized below: 

SeqCalling™ Technology: cDNA was derived from various human samples 
representing multiple tissue types, normal and diseased states, physiological states, and 
developmental states from different donors. Samples were obtained as whole tissue, primary 
cells or tissue cultured primary cells or cell lines. Cells and cell lines may have been treated 
with biological or chemical agents that regulate gene expression, for example, growth factors, 
chemokines or steroids. The cDNA thus derived was then sequenced using CuraGen's 
proprietary SeqCalling technology. Sequence traces were evaluated manually and edited for 
corrections if appropriate. cDNA sequences from all samples were assembled together, 
sometimes including public human sequences, using bioinformatic programs to produce a 
consensus sequence for each assembly. Each assembly is included in CuraGen Corporation's 
database. Sequences were included as components for assembly when the extent of identity 
with another component was at least 95% over 50 bp. Each assembly represents a gene or 
portion thereof and includes information on variants, such as splice forms, single nucleotide 
polymorphisms (SNPs), insertions, deletions and other sequence variations. 
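
The inclusion rule stated above (at least 95% identity over 50 bp) can be illustrated with a simple ungapped comparison. The Python sketch below is only an illustration of the criterion, not CuraGen's assembly software: it slides one sequence along the other without gaps and reports whether any 50-base stretch reaches the threshold. The function name and parameters are hypothetical.

```python
def shares_identity_window(a: str, b: str, window: int = 50, min_percent: int = 95) -> bool:
    """True if some ungapped alignment of a and b has a `window`-long stretch
    with at least `min_percent` percent identical bases."""
    needed = (window * min_percent + 99) // 100  # integer ceiling; 48 of 50 for 95%
    # offset = index in `a` aligned with b[0]; negative offsets shift `b` left.
    for offset in range(-(len(b) - window), len(a) - window + 1):
        start_a, start_b = max(0, offset), max(0, -offset)
        overlap = min(len(a) - start_a, len(b) - start_b)
        if overlap < window:
            continue
        matches = [a[start_a + i] == b[start_b + i] for i in range(overlap)]
        count = sum(matches[:window])
        if count >= needed:
            return True
        # Slide the window one base at a time along the overlap.
        for i in range(window, overlap):
            count += matches[i] - matches[i - window]
            if count >= needed:
                return True
    return False


component = "ACGT" * 20
candidate = component[10:70]  # 60 bases taken from the component
print(shares_identity_window(component, candidate))
```

A production assembler would also allow gaps and reverse-complement matches; this sketch leaves both out for brevity.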

RACE: Techniques based on the polymerase chain reaction, such as rapid amplification 
of cDNA ends (RACE), were used to isolate or complete the predicted sequence of the cDNA 
of the invention. Usually multiple clones were sequenced from one or more human samples to 
derive the sequences for fragments. The following human samples from different donors were 
used for the RACE reaction: adrenal gland, bone marrow, brain - amygdala, brain - cerebellum, 
brain - hippocampus, brain - substantia nigra, brain - thalamus, brain - whole, fetal brain, 
fetal kidney, fetal liver, fetal lung, heart, kidney, lymphoma - Raji, mammary gland, pancreas, 
pituitary gland, placenta, prostate, salivary gland, skeletal muscle, small intestine, spinal cord, 
spleen, stomach, testis, thyroid, trachea and uterus. The sequences derived 
from these procedures were included in the SeqCalling Assembly process described in the 
preceding paragraph. 

Multiple clones were sequenced and these fragments were assembled together, 
sometimes including public human sequences, using bioinformatic programs to produce a 
consensus sequence for each assembly. Each assembly is included in CuraGen Corporation's 
database. Sequences were included as components for assembly when the extent of identity 
with another component was at least 95% over 50 bp. Each assembly represents a gene or 
portion thereof and includes information on variants, such as splice forms, single nucleotide 
polymorphisms (SNPs), insertions, deletions and other sequence variations. 

The DNA sequence and protein sequence for a novel SHD protein-like gene were 
obtained by exon linking and extended by RACE and are reported here as CuraGen Acc. No. 
CG51761-02, or NOV8b. 

The disclosed NOV8 gene is expressed in, for example, the following tissues: adrenal 
gland, bone marrow, brain - amygdala, brain - cerebellum, brain - hippocampus, brain - 
substantia nigra, brain - thalamus, brain - whole, fetal brain, fetal kidney, fetal liver, fetal lung, 
heart, kidney, lymphoma - Raji, mammary gland, pancreas, pituitary gland, placenta, prostate, 
salivary gland, skeletal muscle, small intestine, spinal cord, spleen, stomach, testis, thyroid, 
trachea and uterus. This expression information was derived from the tissue sources of the 
sequences that were included in the derivation of the sequence of NOV8. 

The 1223 bp nucleic acid for NOV8b (SEQ ID NO:21) is shown in Table 8C. An open 
reading frame was identified beginning at nucleotides 101-103 and ending at nucleotides 
1124-1126. The start (ATG) and stop (TGA) codons of the open reading frame are highlighted 
in bold type. Putative untranslated regions are underlined. NOV8b differs from NOV8a by 
having a 100 bp 5' UTR and a 97 bp 3' UTR. Additionally, there are 20 nucleotide 
differences, all located between nucleotides 247 and 420 (numbered with respect to NOV8b). 
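
The UTR lengths quoted for NOV8b follow directly from the transcript length and the open reading frame coordinates. A small Python sketch of that partition is shown below; the function and key names are illustrative, and coordinates are treated as inclusive 1-based positions.

```python
def partition_transcript(total_length: int, orf_start: int, orf_end: int) -> dict[str, int]:
    """Split a transcript into 5' UTR, ORF and 3' UTR lengths (inclusive 1-based ORF)."""
    if not 1 <= orf_start <= orf_end <= total_length:
        raise ValueError("ORF coordinates must lie within the transcript")
    return {
        "utr5": orf_start - 1,
        "orf": orf_end - orf_start + 1,
        "utr3": total_length - orf_end,
    }


# NOV8b: 1223 bp transcript, ORF from nucleotide 101 through nucleotide 1126.
parts = partition_transcript(1223, 101, 1126)
print(parts)  # {'utr5': 100, 'orf': 1026, 'utr3': 97}
```

The 1026-nucleotide ORF is the same length as the NOV8a coding sequence, which is consistent with both forms encoding 341 residues.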



Table 8C. NOV8b Nucleotide Sequence (SEQ ID NO:21) 



CTTCCTCTCCACCTCCTCCTCCTCCTTGGGGAAAGGGGCCCGGfiGARGGGCATGTGGGGG 60 

CCCCTCTGACAGTGGCCCGATTGGGGTGACAGGCGCCCAAA TC^CCAAGTGGCTACGGGA 120 

CTACCTGAGCTTTGGGGGTCGGAGGCCCCCTCCGCAGCCGCCCACCCCGGACTACACCGA 180 

GAGCGAGATCCTGAGGGCCTACCGCGCGCAGAAGAACCTGGACTTCGAGGACCCCTATGA 240 

GGACGCCGAGAGCCGCTTGGAGCCGGACCCCGCGGGGCCTGGGGACTCCAAGAACCCCGG 300 

AGATGCCAAGTATGGTTCTCCCAAACACCGGCTCATCAAGGTGGAGGCTGCGGATATGGC 360 

CAGAGCCAA6ACCCTTCTGGQCGGCCCCGGGGAGGAGCTGGAAGCCGACACTGAGTATTT 420 

AGACCCCTTTGATGCTCAGCCTGATCCTGCACCGCCGGATGATGGGTACATGGAGCCCTA 480 

CGATGCCCAATGGGTCATGAGTGAACTTCCCGGCAGAGGGGTGCAGCTCTATGACACCCC 540 

TTATGAGGAACAGGACCCAGAGACAGCAGATGGACCCCCTTCTGGGCAGAAGCCTCGGCA 600 

GAGCCGGATGCCCCAGGAAGATGAACGGCCAGCAGATGAGTATGATCAGCCCTGGGAGTG 660 

GAAGAAAGACCACATCTCCAGGGCGTTTGCACCAGTGCAGTTTGACAGTCCAQAG'PGGGA 720 

GAG(^TCCAGGCTCAGCCAAGGAGCTCCGGAGACCTCGGCCCAGAAGCCCC 780 

GGAGCGTGTGGACCCAGCCCTGCCCCTGGAGAAACAGCCGTGGTTTCATGGCCCCCTGAA 840 

CAGGGCG6ATGCAGAGAGCCTCCT6TCCCTCTGCAAGGAAGGCAGCTACCTAGTGCGGCT 900 

CAGTGAGACCAACCCCCAGGACTGCTCCl^GTCTCTCAGGAGCAGGCAGGGCTTCCTGCA 960 

TCTGAAGTTCGCGCGGACCCGTGAGAACCAGGTGGTGCTGGGCCAACACAGCGGGCCCTT 1020 

CCCCAGCGTGCCCGAGCTCGTCCTCCACTACAGTTCACGCCCACT6CCGGTGCAGGGTGC 1080 

CGAGCATCTGGCTCTGCTGTACCCCGTGGTCACGCAGACCCCCTG ACAGTGACCCTCGGC 1140 

CCCCTTTTGAGTCCTCGGGCCCAGAATCGTATCCCAAAGCCCTCCCATGGCCTAGAAAAT 1200 

AAATAAGTTATTGTTTGTCTTAG 1223 



The disclosed nucleic acid sequence has 309 of 377 bases (81%) identical to a 1529 bp 
Mus musculus src homology domain (SHD) mRNA (GENBANK-ID:AB018423) (E value = 
3.0e-^^% 

The NOV8b protein encoded by SEQ ID NO:21 has 341 amino acid residues, and is 
presented using the one-letter code in Table 8D (SEQ ID NO:22). The SignalP, Psort and/or 
Hydropathy profile for NOV8 predict that NOV8 has a signal peptide and is likely to be 
localized in the cytoplasm with a certainty of 0.5050. NOV8b differs from NOV8a at 9 
positions: T91>A; L100>V; E101>R; A102>G; D103>W; T104>V; E105>A; Y106>W 
and L107>G. 



Table 8D. Encoded NOV8b protein sequence (SEQ ID NO:22). 



MAKWLRDYLSFGGRRPPPQPPTPDYTESDItiRAYRAQKNLDFEDPYEDAESRLEPDPAGP 60 
GDSKNPGDAKYGSPKHRLIKVEAADMARAKTLLGGPGEELEADTEYLDPFDAQPHPAPPD 120 
DGYMEPYDAQWVMSELPGRGVQLYDTPYEEQDPETADGPPSGQKPRQ3RMPQEDERPADE 180 
YDQPWEMKKDHISRAFAPVQFDSFEWERTPGSAKELRRPPPRSPQPAERVDPALPLEKQP 240 
WFHGPLKRADAESLLSLCKEGSYIiVRLSETNPQDCSLSLRSSQGFLHLKFARTRENQWli 300 
GQH3GPFP3VFELVLHYSSRPLPVQGAEHLALLYPWTQTP 341 
The full amino acid sequence of the protein of the invention was found to have 261 of 
338 amino acid residues (77%) identical to, and 279 of 338 residues (82%) positive with, the 
343 amino acid residue SHD protein from Mus musculus (ptnr:SPTREMBL-ACC:O88834) 
(E value = 4.3 e-^^*^). 

Patp results include those listed in Table 8E. 



Table 8E. Patp alignments of NOV8 

                                                                           Smallest 
                                                          Reading   High   Prob 
Sequences producing High-scoring Segment Pairs:           Frame     Score  P(N) 

patp:Y07040 Breast cancer associated antigen precursor... +1        521    4.4e-64 
patp:B54255 Human pancreatic cancer antigen protein se... +X        347    1.7e-33 
patp:R37746 Collagen-like polymer DCP5 encoded by clon... -3        166    1.6e-08 
patp:R93257 Collagen-like polymer sequence D gene 5 p...  -3        166    1.6e-08 

Further BLAST analysis produced the significant results listed in Table 8F. The 
disclosed NOV8 protein has good identity with a number of src domain-containing proteins. 



Table 8F. BLAST results for NOV8 

Gene Index/Identifier                    Protein/Organism                           Length (aa)  Identity (%)   Positives (%)  Expect 

gi|6677939|ref|NP_033194.1| (AB018423)   src homology 2 domain-containing           343          248/338 (73%)  266/338 (78%)  e-114 
                                         transforming protein D [Mus musculus] 
gi|9368520|emb|CAB98202.1| (AL390078)    similar to (NP_033194.1) src homology 2    247          238/255 (93%)  241/255 (94%)  e-106 
                                         [Homo sapiens] 
gi|545100|gb|AAB29780.1|                 Shb=Src homology 2 protein                 309          126/279 (45%)  176/279 (62%)  3e-50 
                                         [mice, Peptide Partial] 
gi|4506935|ref|NP_003019.1| (X75342)     SHB adaptor protein (a Src homology 2      596          142/339 (41%)  189/339 (54%)  4e-48 
                                         protein) [Homo sapiens] 



This information is presented graphically in the multiple sequence alignment given in 
Table 8G (with NOV8a being shown on line 1) as a ClustalW analysis comparing NOV8 with 
related protein sequences. 
Table 8G. Information for the ClustalW proteins: 

1) NOV8a (SEQ ID NO:20) 

2) NOV8b (SEQ ID NO:22) 

3) gi|6677939|ref|NP_033194.1| src homology 2 domain-containing transforming protein D [Mus musculus] (SEQ ID 
NO:80) 

4) gi|9368520|emb|CAB98202.1| (AL390078) similar to (NP_033194.1) src homology 2 domain-containing 
transforming protein D [Mus musculus] (SEQ ID NO:81) 

5) gi|545100|gb|AAB29780.1| Shb=Src homology 2 protein [mice, Peptide Partial, 309 aa] (SEQ ID NO:82) 

6) gi|4506935|ref|NP_003019.1| SHB adaptor protein (a Src homology 2 protein) [Homo sapiens] (SEQ ID NO:83) 
[ClustalW multiple sequence alignment of NOV8a, NOV8b, gi|6677939|, gi|9368520|, gi|545100| and gi|4506935|, alignment positions 1 to 600; aligned residues not recoverable.] 

The presence of identifiable domains in NOV8 was determined by searches using 
algorithms such as PROSITE, Blocks, Pfam, ProDomain, Prints and then determining the 
Interpro number by crossing the domain match (or numbers) using the Interpro website 
(http://www.ebi.ac.uk/interpro/). 

DOMAIN results for NOV8 were collected from the Conserved Domain Database 
(CDD) with Reverse Position Specific BLAST. This BLAST samples domains found in the 
Smart and Pfam collections. The results are listed in Table 8H with the statistics and domain 
description. The results indicate that NOV8 contains a Src homology 2 domain (gnl|Smart|SH2, 
Src homology 2 domain) at amino acid positions 239-323, which align with residues 1-85 of 
this domain (SEQ ID NO:84). This indicates that the sequence of NOV8 has properties 
similar to those of other proteins known to contain this domain. NOV8b also shows homology 
to this domain, with an E value of 3.1e-22. Src homology 2 domains bind phosphotyrosine- 
containing polypeptides via 2 surface pockets. Specificity is provided via interaction with 
residues that are distinct from the phosphotyrosine. 
Table 8H. DOMAIN results for NOV8 

CD-Length = 85 residues, 100.0% aligned 
Score = 86.3 bits (212), Expect = 2e-18 

[Alignment of NOV8 with gnl|Smart|SH2; aligned residues not recoverable.] 



The Src homology 2 (SH2) domain is a protein domain of about 85 amino-acid residues first 
identified as a conserved sequence region between the oncoproteins Src and Fps. Pawson et 
al., Mol. Cell. Biol. 6:4396-4408, 1986. Similar sequences were later found in many other 
intracellular signal-transducing proteins. Barton et al., FEBS Lett. 304:15-20, 1992. SH2 
domains function as regulatory modules of intracellular signaling cascades by interacting with 
high affinity with phosphotyrosine-containing target peptides in a sequence-specific and strictly 
phosphorylation-dependent manner. Pawson and Schlessinger, Curr. Biol. 3:434-442, 1993; 
Baltimore and Mayer, Trends Cell Biol. 3:8-13, 1993; Pawson, Nature 373:573-580, 1995. 
They are found in a wide variety of protein contexts, e.g., in association with catalytic domains 
of phospholipase Cγ (PLCγ) and the nonreceptor protein tyrosine kinases; within structural 
proteins such as fodrin and tensin; and in a group of small adaptor molecules, i.e., Crk and 
Nck. In many cases, when an SH2 domain is present so too is an SH3 domain, suggesting that 
their functions are inter-related. 

Adaptor proteins link catalytic signaling proteins to cell surface receptors or 
downstream effector proteins. Using a subtractive hybridization strategy to identify genes that 
are specifically expressed in activated CD8+ T cells, Spurkland et al. (J. Biol. Chem. 273: 
4539-4546, 1998) isolated cDNAs encoding SH2D2A, which they named TSAD. The 
predicted 389-amino acid SH2D2A protein contains an Src homology-2 (SH2) domain, 
putative SH3 domain-binding motifs, and putative phosphotyrosine-binding domain (PTB)- 
binding motifs, but no known catalytic domains. The authors also isolated cDNAs 
representing alternatively spliced SH2D2A transcripts that encode deduced 361- and 399- 
amino acid proteins. Northern blot analysis detected an approximately 1.7-kb SH2D2A 
transcript in peripheral blood leukocytes, thymus, and spleen. SH2D2A was expressed in 
activated T cells, but not in resting T cells or in B cells. Its expression was rapidly induced 
after activation of T cells. Antiserum raised against SH2D2A reacted with a 52-kD protein on 
Western blots of T-cell lysates. Recombinant SH2D2A expressed in mammalian cells 
localized to the cytoplasm. Spurkland et al. (J. Biol. Chem. 273: 4539-4546, 1998) showed 
that SH2D2A is tyrosine-phosphorylated in vivo. They suggested that SH2D2A is an adaptor 
protein involved in T cell signaling. 

By searching an EST database for sequences with signal transduction motifs, Lu et al.
(J. Biol. Chem. 274: 10047-10052, 1999) identified a cDNA encoding a deduced 698-amino
acid protein, which they named NSP3 (novel SH2-containing protein-3). Sequence analysis
revealed that NSP3 also contains a potential SH3 interaction domain. Northern blot analysis
detected significant levels of a 3.2- and a 3.8-kb NSP3 transcript in a wide variety of tissues.
Further, Lu et al. (supra) also identified a cDNA encoding a deduced 576-amino acid
protein, which they named NSP1 (novel SH2-containing protein-1). Sequence analysis
revealed that NSP1 also contains a potential SH3 interaction domain. Northern blot analysis
detected significant levels of a 2.7-kb NSP1 transcript only in placenta, pancreas, kidney, lung,
fetal kidney, and fetal lung. Treatment with insulin or epidermal growth factor (EGF) resulted
in rapid tyrosine phosphorylation of NSP1 and increased association of the 64-kD NSP1 with
p130-Cas. In contrast, contact with fibronectin resulted in little phosphorylation of NSP1 but
increased phosphorylation of the p130-Cas associated with NSP1. The authors determined
that expression of NSP1 leads to activation of the stress-activated protein kinase JNK1
(MAPK8) but not ERK2 (MAPK1).

Many proteins involved in the regulation of cellular proliferation contain sequence
motifs named SH2 and SH3. Pawson and Gish, Cell 71: 359-362, 1992. These domains
mediate interaction with other proteins; the SH2 domain interacts with tyrosine
phosphorylation sites, while SH3 domains interact with proline-rich sequences. Many signal
transduction pathways involve the induced formation of complexes of proteins such as
growth factor receptors, adaptor proteins, and target enzymes through SH2 and SH3
interactions. Adaptor proteins are molecules with multiple protein interaction motifs that do
not appear to have catalytic activity of their own but mediate the interaction of other proteins.
The SHB gene encodes two such adaptor proteins (from two different start methionines) of 67
and 56 kD. Welsh et al., Oncogene 9: 19-27, 1994. By PCR analysis of a somatic cell hybrid
mapping panel, Yulug et al. (Genomics 24: 615-617, 1994) mapped the SHB gene to
chromosome 9. By fluorescence in situ hybridization, they regionalized the gene to 9p12-p11.

Oda et al. (Oncogene 11: 1255-62, 1997) used a yeast two-hybrid screen to identify
proteins binding to the Abl tyrosine kinase in order to understand the molecular mechanism of
Bcr-Abl mediated transformation. Two partial cDNAs encoding novel SH2 domain-containing
proteins were cloned and designated SHD and SHE. Both have homology to SHB,
a previously reported SH2 domain-containing protein. Northern blot analysis showed that
SHE is expressed in heart, lung, brain, and skeletal muscle, while expression of SHD is
restricted to the brain. The deduced amino acid sequence of the full length mouse SHD cDNA
contains an amino-terminal proline-rich region and a carboxy-terminal SH2 domain. A
bacterially expressed SHD domain bound multiple tyrosine-phosphorylated proteins with
relative molecular weights of 200, 170, 130, 100, 90, 78, 72 and 32 kDa from K562 cell
lysates. SHD contains five YXXP motifs, a substrate sequence preferred by Abl tyrosine
kinases. The domains are frequently found as repeats in a single protein sequence. The
structure of the SH2 domain belongs to the alpha+beta class, its overall shape forming a
compact flattened hemisphere. The core structural elements comprise a central hydrophobic
anti-parallel beta-sheet, flanked by 2 short alpha-helices. In the v-src oncogene product SH2
domain, the loop between strands 2 and 3 provides many of the binding interactions with the
phosphate group of its phosphopeptide ligand, and is hence designated the phosphate binding
loop. SHD was tyrosine phosphorylated in COS-7 cells co-transfected with SHD and c-Abl or
Bcr-Abl. These results suggest that SHD may be a physiological substrate of c-Abl and may
function as an adapter protein in the central nervous system.

The similarity information for the NOV8 protein and nucleic acid disclosed herein
suggests that NOV8 may have important structural and/or physiological functions characteristic
of the src homology domain (SHD) family. Therefore, the nucleic acids and proteins of the
invention are useful in potential diagnostic and therapeutic applications and as a research tool.
These include serving as a specific or selective nucleic acid or protein diagnostic and/or
prognostic marker, wherein the presence or amount of the nucleic acid or the protein is to be
assessed, as well as potential therapeutic applications such as the following: (i) a protein
therapeutic, (ii) a small molecule drug target, (iii) an antibody target (therapeutic, diagnostic,
drug targeting/cytotoxic antibody), (iv) a nucleic acid useful in gene therapy (gene
delivery/gene ablation), (v) a composition promoting tissue regeneration in vitro and in
vivo, and (vi) a biological defense weapon. The novel nucleic acid encoding NOV8, and the NOV8
protein of the invention, or fragments thereof, may further be useful in diagnostic applications,
wherein the presence or amount of the nucleic acid or the protein is to be assessed.

The disclosed NOV8 nucleic acids and proteins of the invention are useful in potential
therapeutic applications implicated in cancer and lymphoproliferative syndrome, as well as
Von Hippel-Lindau (VHL) syndrome, Alzheimer's disease, stroke, tuberous sclerosis,
hypercalcemia, Parkinson's disease, Huntington's disease, cerebral palsy, epilepsy,
Lesch-Nyhan syndrome, multiple sclerosis, ataxia-telangiectasia, leukodystrophies, behavioral
disorders, addiction, anxiety, pain, neuroprotection, myasthenia gravis, and/or other
pathologies and disorders.

For example, a cDNA encoding the SHD-like protein may be useful in gene therapy,
and the SHD-like protein may be useful when administered to a subject in need thereof. By
way of nonlimiting example, the compositions of the present invention will have efficacy for
treatment of patients suffering from cancer, lymphoproliferative syndrome, Von Hippel-Lindau
(VHL) syndrome, Alzheimer's disease, stroke, tuberous sclerosis, hypercalcemia,
Parkinson's disease, Huntington's disease, cerebral palsy, epilepsy, Lesch-Nyhan syndrome,
multiple sclerosis, ataxia-telangiectasia, leukodystrophies, behavioral disorders, addiction,
anxiety, pain, neuroprotection, myasthenia gravis, and/or other pathologies and
disorders. The novel nucleic acid encoding SHD-like protein, and the SHD-like protein of the
invention, or fragments thereof, may further be useful in diagnostic applications, wherein the
presence or amount of the nucleic acid or the protein is to be assessed. These materials are
further useful in the generation of antibodies that bind immunospecifically to the novel NOV8
substances for use in therapeutic or diagnostic methods. These antibodies may be generated
according to methods known in the art, using prediction from hydrophobicity charts, as
described in the "Anti-NOVX Antibodies" section below. For example, the disclosed NOV8
protein has multiple hydrophilic regions, each of which can be used as an immunogen. These
novel proteins can also be used to develop assay systems for functional analysis.

NOV9 

A disclosed novel NOV9 nucleic acid of 2031 nucleotides (also referred to as
AI284055_EXT) is shown in Table 9A (SEQ ID NO:23). An ORF begins with an ATG
initiation codon at nucleotides 1-3 and ends with a TGA codon at nucleotides 2029-2031. The
start and stop codons are in bold letters in Table 9A.

Table 9A. NOV9 Nucleotide Sequence (SEQ ID NO:23)

GGATCGACGACAXCGCGGATGGCGCCGTGAAGCCCCC^iCCCAA(^GTACCCCATCTTTTTCTTTGGC^^ 
ACACGAAACGGCCTTCCTGGGACCCAAGGACCTGTTCCGCTACGACAAATGTAAAGACMGTACGGGAAG 
CCCAACAAGAGGAAAGGCTTCmTGAAGGGCTGTGGGAGATCCAGAACAACCCCCACGC^ 
CCCCTCCGCCAGTGAGCTCCTCCGACAGCGAGGCCCCCGAGGCCAACCCCGCCGACGGCAGTGACGCTGA 
CGAGGACGATGAGGACCGGGGGGTCATGGCCGTCACAGCGGTAACCGCCACAGCTGCCAGCGACAGGATG 
GAGAGCGACTCAGACTCAGACAAGAGTAGCGACAACJU3TGGCCTGAAGAGGAAGACGCCTGCGCTA 
rATCGGTCTCGAAACGAGCCCGAAAGGCCTCCAGCGACCTGGATCAGGCCAGCGTGTCCCCATCCGAAGA 
GGAGAACTCGGAAAGCTCATCTGAGTCG6AGAAGACCAGCGACCAGGACTTCACACCTGAGAAGAAAGCA 
GCGGTCCGGGCGCCACGGAGGGGCCCTCTGGGGGGACGGAAAAAAAAGAAGGCGCCATCAGCCTCCGACT 
CCGACTCCAAGGCCGATTCGGACGGGGCGAAGCCTGAGCCGGTGGCCATGGCGCGGTCGGCGTCCTGCTC 
CTCCTCTTCCTCCTCCTCCTCCGACTCCGATGTGTCTGTGAAGAAGCCTCCGAGGGGCAGGAAGCCAGCG ^ 
GAGAAGCCTCTCCCGAAGCCGCGAGGGCGGAAACCGAAGCCTGAAa;iGCCTCCGTCa\GGTCCAG^^ 
ACAGTGACAGCGACGAGGtGGACCGCATCAGTGAGTGGAAGCGGCGGGACGAGGCGCGGAGGCGCGAGCT 
GGAGGCCCGGCGGGGGCGAGAGCAGGAGGAGGAGCTGCGGCGCCTGCGGGAGCAGGAGAAGGAGGAGAAG 
GAGCGGAGGCGCGAGCGGGCCGACCGCGGGGAGGCTGAGCGGGGCAGCGGCGGCAGCAGCGGGGACGAGC 
TCAGGGAGGACGATGAGCCCGTCAAGAAGCGGG{5ACGCAAGGGCCGGGGCCGGGGTCCCCCGTCCTCCTC 
TGACTCCGAGCCCGAGGCCGAGCTGGAGAGAGAGGCCAAGAAATCAGCGAAGAAGCCGGAGTCCTCAAGC 
ACAGAGCCCGCCAGGAAACCTGGCGAGAAGGAGAAGAGAGTGC6GCCCGAGGAGAAGCAACAAGCCAGGC 
CCGTC^GGTGGAGCGGACCCGGAAGCGGTCCGAGGGCTTCTCGATGGACAGGAAGGTAGAGAAGAAGAA 
AGAGCCCrrCCGTGGAGGAGAAGCTGCAGAAGCTGCACAGTGAGATCAAGTTTQCCCTAAAGGTCGACAGC 
CCG6ACGTGAAGAGGTGCCTGAATGCCCTAGA6GAGCTGGGAACCCTGCAGGTGACCTCTCAGATCCTCC 
AGAAGAACy^CAGACGTGGTGGCCACCTTGAAGAAGATTCGCCGTTACAAAGCGMCAAGGACGTAATC 
GAAGGCAGCA6AAGTCTATACCCG6CTCAAGTCGCGGGTCCTCGGCCCAAAGATCGAGGCGGTGCAGAAA 
GTGAACAAGGCTGGGATGGAGAAGGAGAAGGCCGAGGAGAAGCTGGCCGGGGAGGAGCTG6CCX3GGGAGG 
AGCTGGCGGGGGAGGAGGCCCCCCAGGAGAAGGCGGAGGACAAGCCCAGCACCGATCTCTCAGCCCCAGT 
GAATGGCGAGGCCACATCACAGAAGGGGGAGAGCGGAGAGGACAA6GAGCACGAGGAGGGT?CGGGACTCG 
GAGGAGGGGCCy^GTGTGGCTCCTCTGAAGACCTGCACGAGAGCGTACGGGAGGGTCCCGACCTGGACA 
GGCCTGGGAGCGACCGGCAGGAGCGCGAGAGGGCACGGGGOSACTCGGAGGCCCTGGACGAGGAGAGCTmW 



A disclosed NOV9 protein encoded by SEQ ID NO:23 has 676 amino acid residues,
and is presented using the one-letter code in Table 9B (SEQ ID NO:24). The SignalP, Psort
and/or Hydropathy profile for NOV9 predict that NOV9 has no signal peptide and is likely to
be localized at the nucleus with a certainty of 0.9866; the mitochondrial matrix space with a
certainty of 0.1000; the lysosome (lumen) with a certainty of 0.1000; and the endoplasmic
reticulum (membrane) with a certainty of 0.0000.
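As a consistency check, the ORF coordinates given for Table 9A (an ATG at nucleotides 1-3 and a TGA ending at nucleotide 2031) imply the 676-residue length stated for the protein. The following Python sketch shows the arithmetic; the function name `orf_protein_length` is hypothetical and is used for illustration only.

```python
def orf_protein_length(start: int, stop_end: int) -> int:
    """Return the number of amino acids encoded by an ORF.

    start    -- 1-based position of the first base of the initiation codon
    stop_end -- 1-based position of the last base of the stop codon
    """
    span = stop_end - start + 1
    if span < 6 or span % 3 != 0:
        raise ValueError("ORF span must be a multiple of 3 and include start and stop codons")
    # Every codon except the terminal stop codon contributes one residue.
    return span // 3 - 1


# Coordinates quoted in the text: ATG at 1-3, TGA at 2029-2031.
print(orf_protein_length(1, 2031))  # 676
```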

The disclosed NOV9 protein is similar to the Mus musculus hepatoma-derived growth
factor, related protein 2 (SPTREMBL-ACC:O35540).



Table 9B. Encoded NOV9 protein sequence (SEQ ID NO:24). 

mphafkpgdlvfakmkgyphwpariddiadgavkpppnkypifffgthetaflgpkdlfpydkSdK^ 
pnkrkgfneglweiqnnphasysappfvsssdseapeanpadgsdadeddedrgvmavtavtat^ 

ESDSD3DK3SDNSGLKRKTPALKVSVSKRARKASSDLDQASVSPSEEENSESSSESEKTSDQDFTPEKKA 

AVIUIPRRGPLGGRKKKKAP3ASDSDSKADSW3AKPEPVAMARSA3SS3SSSSSSDSDVSVKKPPRGRKPA 

EKPLPKPRGRKPKPERPPSSSSSDSDSDEVDRISEWKRRDEARRRELSARRRREQEEELRRLREQEKEEK 

ERRRERADRGEAERGSGGSSGDELREDDEPVKKRGRKGRGRGPPSSSDSSPEAELEREAKKSAKKPQSSS 

TEPARKPGQKEKRVRPEEKQQARPVKVERTRKRSEGFSMDRKVEKKKEPSVEEKLQKLHSEIKFA^ 

PDVKRCLNALEELGTLS^VTSQILQKNTDWATLKKIRRYKANKDVMEKA^ 

VNKAGMEKEKAEEKLAGEELAGEELAGEEAPQEKAEDKPSTDLSAPVNGEATSQKGESAEDKEHEEGRDS 
EEGPRCGSSEDLHESVREGPDLDRPGSDRQERERARGDSEALDESS

Hepatoma-derived growth factor (HDGF) and HDGF-related proteins (HRP) belong to
a gene family with a well-conserved amino acid sequence at the N-terminus (the hath region).
A new member of the HDGF family in humans and mice was identified, cloned, and designated
HRP-3. The deduced amino acid sequence from HRP-3 cDNA contained 203 amino acids
without a signal peptide for secretion. HRP-3 has a 97-amino-acid sequence at the N-terminus
that is highly conserved with the hath region of the HDGF family proteins. It also
has a putative bipartite nuclear localizing signal (NLS) sequence in its self-specific region,
in a location similar to that in HDGF and HRP-1. Northern blot analysis shows that HRP-3 is
expressed predominantly in the testis and brain, to an intermediate extent in the heart, and to a
slight extent in the ovaries, kidneys, spleen, and liver in humans. Transfection of green
fluorescent protein (GFP)-tagged HRP-3 cDNA showed that HRP-3 translocated to the
nucleus of 293 cells. GFP-HRP-3 transfectants showed significantly greater DNA synthesis
than cells transfected with vector only. The HRP-3 gene was mapped to chromosome 15,
region q25 by FISH analysis. These findings suggest that a new member of the HDGF gene
family, HRP-3, may function mainly in the nucleus of the brain, testis, and heart, probably for
cell proliferation. See Ikegame et al., Biochem Biophys Res Commun 266(1):81-87 (1999).

Hepatoma-derived growth factor (HDGF)-related protein (HRP)-1, a member of the
HDGF gene family, showed testis-specific expression in mice. HRP-1 expression in
spermatogenesis was analyzed in the testis of normal and azoospermic mice by Northern blot
and immunohistochemistry. HRP-1 gene message was not expressed in the ovary and its
product was detected only in the nuclei of germ cells, not in somatic cells. The HRP-1 gene is
expressed from pachytene spermatocyte through round spermatid. HRP-1 gene expression was
not detected in the testis of cryptorchid mice or in some strains of mutant mice. These findings
suggest that the testis-specific HRP-1 gene may play an important role in the phase around
meiotic cell division. See Kuroda et al., Biochem Biophys Res Commun 262(2):433-37
(1999).

Hepatoma-derived growth factor (HDGF) is an acidic polypeptide with mitogenic
activity for fibroblasts exerted outside the cells despite the presence of a putative nuclear
localization signal (NLS). Three related mouse cDNAs have been cloned: one for a mouse
homologue of human HDGF and two for additional HDGF-related proteins provisionally
designated HDGF-related proteins 1 and 2 (HRP-1 and -2). Their deduced sequences have
revealed that HDGF belongs to a new gene family with a highly conserved 98-amino-acid
sequence at the amino terminus (hath region, for homologous to the amino terminus of
HDGF). HRP-1 and HRP-2 proteins are 46 and 432 amino acids longer than mouse HDGF,
respectively, and have no conserved amino acid sequence other than the hath region. HRP-1 is
a highly acidic protein (26% acidic) and also has a putative NLS. HRP-2 protein carries a
mixed charge cluster, a sharp switch of positive- to negative-charge residues, which is often
found in some nuclear proteins. Northern blotting shows that mouse HDGF and HRP-2 are
expressed predominantly in testis and skeletal muscle, to intermediate extents in heart, brain,
lung, liver, and kidney, and to a minimal extent in spleen. HRP-1 is expressed specifically in
testis. These findings suggest that the HDGF gene family might play a new role in the nucleus,
especially in testis. See Izumoto et al., Biochem Biophys Res Commun 238(1):26-32 (1997).

Hepatoma-derived growth factor (HDGF) is the first member identified of a new
family of secreted heparin-binding growth factors highly expressed in the fetal aorta. The
biologic role of HDGF in vascular growth is unknown. Here, HDGF mRNA is expressed in
smooth muscle cells (SMCs), most prominently in proliferating SMCs, 8-24 hours after serum
stimulation. Exogenous HDGF and endogenous overexpression of HDGF stimulated a
significant increase in SMC number and DNA synthesis. Rat aortic SMCs transfected with a
hemagglutinin-epitope-tagged rat HDGF cDNA contain HA-HDGF in their nuclei during
S-phase. Native HDGF was detected in nuclei of cultured SMCs, of SMCs and endothelial cells
from 19-day fetal (but not adult) rat aorta, of SMCs proximal to abdominal aortic
constriction in adult rats, and of SMCs in the neointima formed after endothelial denudation of
the rat common carotid artery. Moreover, HDGF colocalizes with the proliferating cell nuclear
antigen (PCNA) in SMCs in human atherosclerotic carotid arteries, suggesting that HDGF
helps regulate SMC growth during development and in response to vascular injury. See
Everett et al., J. Clin Invest 105(5):567-75 (2000).

In the kidney, there is a close and intricate association between epithelial and
endothelial cells, suggesting that a complex reciprocal interaction may exist between these two
cell types during renal ontogeny. Thus, it was examined whether metanephrogenic
mesenchymal cells secrete endothelial mitogens. With an endothelial mitogenic assay and
sequential chromatography of the proteins in the media conditioned by a cell line of rat
metanephrogenic mesenchymal cells (7.1.1 cells), a protein whose amino acid analysis
identified it as hepatoma-derived growth factor (HDGF) was isolated. Media conditioned by
COS-7 cells transfected with HDGF cDNA stimulated endothelial DNA synthesis. With
immunoaffinity-purified antipeptide antibodies, HDGF was found to be widely distributed in
the renal anlage at early stages of development but soon concentrated at sites of active
morphogenesis and, except for some renal tubules, disappeared from the adult kidney. From a
7.1.1 cell cDNA library, a clone of most of the translatable region of HDGF was obtained and
used to synthesize digoxigenin-labeled riboprobes. In situ hybridization showed that during
kidney development mRNA for HDGF was most abundant at sites of nephron morphogenesis
and in ureteric bud cells, while in the adult kidney transcripts disappeared except for a small
population of distal tubules. Thus, HDGF is an endothelial mitogen that is present in
embryonic kidney, and its expression is synchronous with nephrogenesis. See Oliver et al., J.
Clin Invest 102(6):1208-19 (1998).

A human hepatoma cell line synthesizes, as evidenced by metabolic labeling, an
endothelial cell mitogen that is found to be mostly cell associated. The hepatoma-derived
growth factor (HDGF) has been purified to homogeneity by a combination of Bio-Rex 70,
heparin-Sepharose, and reverse-phase chromatography; it is a cationic polypeptide with a
molecular weight of about 18,500-19,000. HDGF is structurally related to basic fibroblast
growth factor (FGF). Immunological analysis demonstrates that antiserum prepared against a
synthetic peptide corresponding to the amino-terminal sequence of basic FGF cross-reacts
with HDGF when analyzed by electrophoretic blotting and by immunoprecipitation. Sequence
analysis of tryptic fragments demonstrates that HDGF contains sequences that are homologous
to both amino-terminal and carboxyl-terminal sequences of basic FGF. See Klagsbrun et al.,
Proc Natl Acad Sci USA 83(8):2448-52 (1986).

According to the OMIM database entry 300043 for hepatoma-derived growth factor,
Nakamura et al. purified a novel hepatoma-derived growth factor from the conditioned
medium of human hepatoma-derived cell line HuH-7. See Nakamura et al., J Biol Chem
269:25143-49 (1994). Molecular cloning of a cDNA from the cDNA library of the same cell
line was done on the basis of the N-terminal amino acid sequence. The cDNA was 2.4 kb long
and the deduced amino acid sequence contained 240 amino acids without a signal peptide-like
N-terminal hydrophobic sequence. The primary sequence shared homology with the high
mobility group-1 protein (see OMIM database entry 163905); they showed 23.4% amino acid
identity and 35.6% amino acid similarity. Immunofluorescence study showed that HDGF is
localized in the cytoplasm of hepatoma cells and northern blots showed that it is expressed
ubiquitously in normal tissues and tumor cell lines. Nakamura et al. (1994) suggested that it is
a novel heparin-binding protein with mitogenic activity for fibroblasts.

HDGF is ubiquitously expressed in normal tissues and tumor cell lines. By PCR
screening of a commercial monochromosomal hybrid panel, Wanschura et al. (1996) mapped
HDGF to the X chromosome. See Wanschura et al., Genomics 32:298-300 (1996). By
fluorescence in situ hybridization, they determined the subchromosomal localization to be
Xq25. Whereas a major group of the HMG protein family has been mapped to chromosomal
segments frequently involved in the tumorigenesis of benign solid tumors, no tumor
association for the Xq25 region was known.

NOV9 is very likely a nuclear localized peptide, as the NOV9 polypeptide is similar to
the hepatoma-derived growth factor related protein gene family, some members of which are
localized in the nucleus. Hepatoma-derived growth factor related protein genes are
temporarily available extracellularly for growth factor signaling. Therefore, it is likely that
this novel gene is available at the appropriate subcellular localization and hence accessible for
the therapeutic uses described in this application.

This invention describes the following novel hepatoma-derived growth factor related
protein-like protein and nucleic acid encoding same (designated CuraGen Accession Number
AI284055_EXT). This sequence was initially identified by searching public genomic
databases for DNA sequences that translate into proteins with similarity to a protein family of
interest. SeqCalling assembly AI284055 (derived from an Image clone) was identified as
having suitable similarity. SeqCalling assembly AI284055 was analyzed further to identify an
open reading frame encoding a novel full length protein and novel splice forms of this
gene.
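The open-reading-frame identification step mentioned above can be illustrated with a minimal Python sketch. It is not the SeqCalling procedure or any gene-prediction program, whose algorithms are not described here; it only scans the three forward frames of a DNA string for the longest ATG-to-stop segment. The function name `longest_orf` and the toy sequence are assumptions made for illustration.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def longest_orf(dna: str):
    """Return (start, end, orf) for the longest ATG...stop ORF on the forward strand.

    Coordinates are 1-based and inclusive; the stop codon is included in the ORF.
    Returns None when no complete ORF is present.
    """
    dna = dna.upper()
    best = None
    for frame in range(3):
        start = None  # 0-based index of the current candidate ATG, if any
        for i in range(frame, len(dna) - 2, 3):
            codon = dna[i:i + 3]
            if start is None and codon == "ATG":
                start = i
            elif start is not None and codon in STOP_CODONS:
                orf = dna[start:i + 3]
                if best is None or len(orf) > len(best[2]):
                    best = (start + 1, i + 3, orf)
                start = None  # resume the search after this stop codon
    return best


# Toy input (not from the disclosure): one complete ORF in the third frame.
print(longest_orf("CCATGGCTAAATGATT"))  # (3, 14, 'ATGGCTAAATGA')
```

A production pipeline would also scan the reverse complement and handle splicing, which this sketch deliberately omits.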

The genomic clone AC011498 was analyzed by GenScan and Grail to identify exons
and putative coding sequences/open reading frames. The clone AC011498 was also analyzed
by TblastN, BlastX and other homology programs to identify regions translating to proteins
with similarity to the original protein/protein family of interest.

The results of these analyses were integrated and manually corrected for apparent
inconsistencies, thereby obtaining the sequence encoding the full-length protein. When
necessary, the process to identify and analyze cDNAs/ESTs and genomic clones was reiterated
to derive the full-length sequence. This invention describes this full-length DNA sequence(s)
and the full-length protein sequence(s) which they encode.

The gene encoding NOV9 belongs to genomic clone AC011498 on Chromosome 19.

Based on information available from the expression of ESTs with 100% homologous
sequence to AI284055_EXT, it is highly probable that NOV9 is expressed in, for example, but
not limited to, blood, brain, colon, esophagus, foreskin, germ cell, lung, nose, ovary, pancreas,
prostate, spleen, tonsil, and uterus.

Patp results for NOV9 include those listed in Table 9C.

Table 9C. Patp alignments of NOV9



                                                                               Smallest
                                                                               Sum
                                                             Reading   High    Probab.
Sequences producing High-scoring Segment Pairs:              Frame     Score   P(N)

patp:Y99426 Human PRO1604 (UNQ785) amino acid sequence...    +1        3406    0.0
patp:W37483 Mouse liver cancer-originated culture cell-      +1        2769    1.7e-287
patp:B53322 Human colon cancer antigen protein sequence.     +1        2261    1.9e-257
patp:B41868 Human ORFX ORF1632 polypeptide sequence ...      +1        1496    1.4e-152
patp:B42974 Human ORFX ORF2738 polypeptide sequence ...      +1        1068    3.1e-107
patp:B13522 Human hepatoma-derived growth factor homolog...  +1         543    1.3e-51



For example, a BLAST against Y99426, a 671 amino acid hepatoma-derived growth factor
from Homo sapiens, produced 668/676 (98%) identity, and 671/676 (99%) positives (E = 0.0), with
long segments of amino acid identity, as shown in Table 9D. See WO 00/12708-A2.
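The identity and positives percentages quoted in this section are consistent with truncating integer division of the match count by the aligned length, rather than rounding (for example, 553/676 is about 81.8% but is reported as 81%). The short Python sketch below reproduces that arithmetic; the helper name `blast_percent` is hypothetical, and truncation is inferred from the quoted figures rather than stated in the text.

```python
def blast_percent(matches: int, aligned: int) -> int:
    """Percentage of matching columns, truncated to a whole number."""
    if aligned <= 0:
        raise ValueError("aligned length must be positive")
    return (100 * matches) // aligned


# Count pairs quoted in the surrounding text.
print(blast_percent(668, 676))  # 98  (identity against Y99426)
print(blast_percent(671, 676))  # 99  (positives against Y99426)
print(blast_percent(553, 676))  # 81  (identity against W37483)
print(blast_percent(603, 676))  # 89  (positives against W37483)
```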



Table 9D. BLAST Results of NOV9 and Y99426 (SEQ ID NO:85)

Score = 3406 (1199.0 bits), Expect = 0.0, P = 0.0
Identities = 665/676 (98%), Positives = 671/676 (99%), Frame = +1

NOV9: 1 HPHAFKPGDLVFAKMKGYPHWPTVRIDDlADGAVKPPPbJKYPIFETGTHETAFLGPKDLFP 60 

, ilHitlitlllllllllllillllMIMIlItllllillllllllllll !ii 

Y99426: 1 MPHAFKPGDLVFAKMK(^PHWPJU^IDDIADGAVKPPPNKYPlFFFGTHBra 60 
N0V9: 61 YDKCKDKYGKPNKRKGFNEGLMElONNPHASYSAPPPVSfiSDSEAPBANPADGSDADEDD 120 

^nn.n^ I I 1 Ml I i I I I t 1 I I I I 1 I I I 1 | 1 | i I I I I I I 1 I I I I t I I I t 1 t t 1 t I 1 t I I ) I I 

Y99426: 61 YDKCKDKYGKPNKRKGFNEGLWEIQNNPHASYSAPPPVSSSDSEAPEANPADGSDADEDD 120 

NOV0: 121 EDRGVMAVTAVTATAASDRMESDSDSDKSSDNSGLKRKTPALKVSVSKRARKASSDLDQA 180 
vao.o^ ..1 IHIinil|llI||l|l||tI|[||i|lin||||l|l|l|| + |||||||,,t,,,,|, 
Y99426: 121 EDRGVMAVTAVTATAASDI^ESDSDSDKSSDNSGLKRKTPALKMSVSBaUVRKASSDLDQA 180 

N0V9: 181 SVSPSEEBNSESS3ESBKT3[X3DFTPEa(KAAVRAPRRGPLGGRKKKKAP3ASDSD3KADS 240 
vn..o. ... nillllll!|ll||||j|t||||ill|l||ll||l|||l|||||||ll|lll)||))|) 
Y99426; 181 SVSPSEEENSESSSESEKTSDQDrTPEKKAAVRAPRRGPLGGRKKKECAPSASDSDSKADS 240 

N0V9: 241 DGAKPEPVAMARSASSSSSSSSSSDSDVSVKKPPRGRKPAEKPLPKPRGRKPKPERPPSS 300 

v.n... . . li'iMiitiiiiiiittMiiitiiiiiititniiiiiiiiiiiiiiiiiiiiiiiii 

Y99426: 241 DGAKPEPVAMARSA5SSSS3SSSSDSDVSVKKPPRGRKPABKPLPKPRGRKPKPERPPSS 300 
N0V9: 301 SSSDSDSDEnrDRlSBWKRRDBARRRBIiEARRRREQEEEIiRRIJlEQEKEEKERRRERADRG 360 

voa.o. niMIIIIilllllllltlMIIIIIIitlllllillDIIIIIIIIIItllllltltl 

Y99426: 301 SSSDSDSDEVDRISEWKRRDEARRRELEARRRREQEEELRRLREQEKEEKERRRERADRG 360 
N0V9: 361 EAERGSGGSSGDELREDDEPVKKRGRKGRGRGPPSSSDSEPEAELEREAKKSAKKPQSSS 420 

iiniiittiiiiiiiiiiiiiiitiitiiiiiitiiiiiiiitiiiiiiiiiiiiiiii 

Y9942 6: 361 EAERGSGG3SGDEI*REDDEPVKECRGRKGRGRGPPSSSDSEPE7^LEREAKKSAKKPQSSS 420 
N0V9: 421 TEPARKPGQKEKRVRPEEKQQARPVKVERTRKR3EGFSMDRKVEKKKEPSVEEKLQKLHS 480 

iiiiitiniiiiiiiiniit+iiiitiiititiMMiiiiiiiiiiiiiiiiiiiii 

Y9942G; 421 TEPARKPGQKEKRVRPEEKQQAKPVKVERTRKRSEGFSMDRKVEKKKEPSVEEKLQECLHS 480 
N0V9: 481 EIKFALKVDSPDVKRCLNALEELGTLQVTSalLQKNTDVVATLKKIRRYKANKDVMEK^ 540 

^nn.o. I! 1 1 111 ill nil I mil II nil iiitniiii inniin 

Y99426: 481 EIKFALKVDSPDVKRCLNALBELGTLQVTSQILQKNTDVVATLKKIRRYKANKDVMEKAA 540 
N0V9: 541 EVYTRliKSRVLGPKIEAVQKVNKAGMEKEKAEEKIJ^EEIAGEELAGEE^ 600 

iimiiiiniiiiniiiii iinMiiiiin ninnnii 
Y99426: 541 EVYTRLKSIRVLGPKIE^ 

N0V9: 601 TDLSAPVNGEATSQKGESJ^EDKEHEEGRDSEEGPRCGSSEDLHESVREGPDLDRPGSDRQ 660 

IIIIIMIIIIIItllllllllllllllllllllllllllltl + ltillliillllllll 
y99426: 596 TDLSAPVNGEATSQKGESAEDKEHEEGRDSEEGPRCGSSEDLHDSVREGPDLDRPGSDRQ 655 

N0V9; 661 ERERARGD3EALDESS 676 
llllllllllllllt) 

Y99426; 656 ERERARGD5EALDEB3 671 

Additionally, NOV9 also showed a large degree of homology with W37483, a 669
amino acid mouse liver cancer-originated culture cell growth factor. Specifically, a BLAST
produced 553/676 (81%) identity, and 603/676 (89%) positives (E=1.7e-287), with long
segments of amino acid identity. See JP09313185-A.

A BLAST against B53322, a 518 amino acid human colon cancer antigen protein
sequence from Homo sapiens, produced 458/465 (98%) identity and 460/465 (98%) positives
(E=1.9e-257), with long segments of amino acid identity from nucleic acid residues 388 to
1782. Additionally, this BLAST produced 53/80 (66%) identity and 58/80 (72%) positives
(E=2.1e-25) from nucleic acid residues 1677 to 1916; 64/260 (24%) identity and 111/260
(42%) positives (E=2.1e-25) from nucleic acid residues 310 to 1089; 68/296 (22%) identity
and 124/296 (41%) positives (E=4.1e-25) from nucleic acid residues 292 to 1161; 59/245
(24%) identity and 101/245 (41%) positives (E=3.2e-24) from nucleic acid residues 709 to
1443; 19/51 (37%) identity and 27/51 (52%) positives (E=1.8e-239) from nucleic acid residues
1638 to 1790; 21/77 (27%) identity and 37/77 (48%) positives (E=2.8e-18) from nucleic acid
residues 110 to 340; 18/58 (31%) identity and 28/58 (48%) positives (E=5.0e-17) from nucleic
acid residues 195 to 368; and 17/61 (27%) identity and 24/61 (39%) positives (E=1.0e-16)
from nucleic acid residues 204 to 383. See WO 00/55351-A1.

A BLAST against B41868, a 308 amino acid human ORFX polypeptide sequence,
produced 458/465 (98%) identity and 460/465 (98%) positives (E=1.9e-257), with long
segments of amino acid identity from nucleic acid residues 1105 to 2028. See
WO 00/58473-A2.

A BLAST against B42974, a 209 amino acid human ORFX polypeptide sequence,
produced 208/209 (99%) identity and 209/209 (100%) positives (E=3.1e-107), with long
segments of amino acid identity. See WO 00/58473-A2.

The disclosed NOV9 protein (SEQ ID NO:24) has good identity with hepatoma-derived
growth factors. The identity information used for ClustalW analysis is presented in
Table 9E. Where indicated, there were two significant regions of homology.

Table 9E. BLAST results for NOV9



Gene Index/Identifier                          Protein/Organism                                                            Length (aa)             Identity (%)    Positives (%)        Expect        Gaps

gi|12653923|gb|AAH00755.1|AAH00755 (BC000755)  Similar to hepatoma-derived growth factor, related protein 2; Homo sapiens  670 (from aa 405-670)   220/272 (80%)   222/272 (80%)        5e-83         6/272
gi|12653923|gb|AAH00755.1|AAH00755 (BC000755)  Similar to hepatoma-derived growth factor, related protein 2; Homo sapiens  670 (1-280)             148/280 (52%)   149/280 (52%)        2e-53
gi|13277669|gb|AAH03741.1|AAH03741 (BC003741)  Similar to hepatoma-derived growth factor, related protein 2; Mus musculus  678 (426-675)           167/256 (65%)   197/256 (76%)        3e-64         6/256
gi|13277669|gb|AAH03741.1|AAH03741 (BC003741)  Similar to hepatoma-derived growth factor, related protein 2; Mus musculus  678 (1-208)             124/209 (59%)   [illegible] (59%)    [illegible]   1/209 (0%)
gi|6680201|ref|NP_032259.1|                    Hepatoma-derived growth factor, related protein 2; Mus musculus             669                     167/256 (65%)   197/256 (76%)        7e-64         6/256 (2%)



This information is presented graphically in the multiple sequence alignment given in
Table 9F (with NOV9 being shown on line 1) as a ClustalW analysis comparing NOV9 with
related protein sequences.



Table 9F. Information for the ClustalW proteins:

1) NOV9 (SEQ ID NO:24)

2) gi|12653923|gb|AAH00755.1|AAH00755 (BC000755) Similar to hepatoma-derived growth factor, related protein 2
(Homo sapiens) (SEQ ID NO:86)

3) gi|13277669|gb|AAH03741.1|AAH03741 (BC003741) Similar to hepatoma-derived growth factor, related protein 2
(Mus musculus) (SEQ ID NO:87)

4) gi|6680201|ref|NP_032259.1| Hepatoma-derived growth factor, related protein 2 (Mus musculus) (SEQ ID NO:88)




WO 01/74851 PCT/US01/10039

[ClustalW multiple sequence alignment of NOV9 with gi|12653923|, gi|13277669|, and gi|6680201|; alignment not reproduced]



The presence of identifiable domains in NOV9 was determined by searches using
algorithms such as PROSITE, Blocks, Pfam, ProDomain, Prints and then determining the
Interpro number by crossing the domain match (or numbers) using the Interpro website
(http://www.ebi.ac.uk/interpro/).

DOMAIN results for NOV9 were collected from the Conserved Domain Database
(CDD) with Reverse Position Specific BLAST. This BLAST samples domains found in the
Smart and Pfam collections. The results are listed in Table 9G with the statistics and domain
description. The results indicate that this protein contains the following protein domains (as
defined by Interpro) at the indicated positions: PWWP domain. This indicates that the
sequence of NOV9 has properties similar to those of other proteins known to contain this
domain and similar to the properties of this domain.



Table 9G. DOMAIN results for NOV9

Domain               Name                                             Score (bits)   E Value
gnl|Pfam|pfam00855   PWWP, PWWP domain                                97.1           2e-21
gnl|Smart|PWWP       Domain with conserved PWWP motif, conservation   73.2
                     of Pro-Trp-Trp-Pro residues





For example, the results of a BLAST of amino acid residues 5-76 of NOV9 against the 74
amino acid long domain gnl|Pfam|pfam00855 (SEQ ID NO:89) are shown in Table 9H.



Table 9H. BLAST of NOV9 against gnl|Pfam|pfam00855

CD-Length = 74 residues, 98.6% aligned
Score = 97.1 bits (240), Expect = 2e-21

[Pairwise alignment of NOV9 with gnl|Pfam|pfam00855; the aligned sequence lines are not legible in this copy.]


The pattern of expression of this gene and its family members, and its similarity to the
hepatoma-derived growth factor related protein-like protein family of genes, suggest that it
may function as a hepatoma-derived growth factor related protein-like protein in the tissues
of expression. Therefore it is implicated in disorders involving these tissues. Some of the
diseases include, but are not limited to, Endometriosis, Fertility, Anemia, Ataxia-telangiectasia,
Autoimmune disease, Immunodeficiencies, Systemic lupus erythematosus, Asthma,
Emphysema, Scleroderma, Von Hippel-Lindau (VHL) syndrome, Alzheimer's disease, Stroke,
Tuberous sclerosis, hypercalcemia, Parkinson's disease, Huntington's disease, Cerebral palsy,
Epilepsy, Lesch-Nyhan syndrome, Multiple sclerosis, Leukodystrophies, Behavioral disorders,
Addiction, Anxiety, Pain, Neuroprotection, Hemophilia, Hypercoagulation, Idiopathic
thrombocytopenic purpura, Graft versus host, Hirschsprung's disease, Crohn's Disease,
Appendicitis, Cancer, and other diseases and disorders. Family members are known to
stimulate endothelial cell mitogenesis and to be involved in nephrogenesis; therefore this novel
gene may also be involved in these activities and therapeutic applications derived from these
activities.

The expression pattern, map location and protein similarity information for the
invention suggest that this gene may function as "Hepatoma-Derived Growth Factor Related
Protein". Therefore, the nucleic acids and proteins of the invention are useful in potential
therapeutic applications implicated in Endometriosis, Fertility, Anemia, Ataxia-telangiectasia,
Autoimmune disease, Immunodeficiencies, Systemic lupus erythematosus, Asthma,
Emphysema, Scleroderma, Von Hippel-Lindau (VHL) syndrome, Alzheimer's disease, Stroke,
Tuberous sclerosis, hypercalcemia, Parkinson's disease, Huntington's disease, Cerebral palsy,
Epilepsy, Lesch-Nyhan syndrome, Multiple sclerosis, Leukodystrophies, Behavioral disorders,
Addiction, Anxiety, Pain, Neuroprotection, Hemophilia, Hypercoagulation, Idiopathic
thrombocytopenic purpura, Graft versus host, Hirschsprung's disease, Crohn's Disease,
Appendicitis, Cancer, endothelial cell mitogenesis, nephrogenesis, and other diseases and
disorders.

Potential therapeutic uses for the invention(s): Protein therapeutic, small molecule drug
target, antibody target (therapeutic, diagnostic, drug targeting/cytotoxic antibody), diagnostic
and/or prognostic marker, gene therapy (gene delivery/gene ablation), research tools, tissue
regeneration in vitro and in vivo (regeneration for all these tissues and cell types composing
these tissues and cell types derived from these tissues).

The nucleic acids and proteins of the invention are useful in potential therapeutic
applications implicated in, for example, but not limited to, Endometriosis, Fertility, Anemia,
Ataxia-telangiectasia, Autoimmune disease, Immunodeficiencies, Systemic lupus
erythematosus, Asthma, Emphysema, Scleroderma, Von Hippel-Lindau (VHL) syndrome,
Alzheimer's disease, Stroke, Tuberous sclerosis, hypercalcemia, Parkinson's disease,
Huntington's disease, Cerebral palsy, Epilepsy, Lesch-Nyhan syndrome, Multiple sclerosis,
Leukodystrophies, Behavioral disorders, Addiction, Anxiety, Pain, Neuroprotection,
Hemophilia, Hypercoagulation, Idiopathic thrombocytopenic purpura, Graft versus host,
Hirschsprung's disease, Crohn's Disease, Appendicitis, Cancer, endothelial cell mitogenesis,
nephrogenesis, and other diseases and disorders. For example, a cDNA encoding the
hepatoma-derived growth factor related protein-like protein may be useful in gene therapy,
and the hepatoma-derived growth factor related protein-like protein may be useful when
administered to a subject in need thereof. By way of non-limiting example, the compositions
of the present invention will have efficacy for treatment of patients suffering from the
pathologies described above. The novel nucleic acid encoding the hepatoma-derived growth
factor related protein-like protein, and the hepatoma-derived growth factor related protein-like
protein of the invention, or fragments thereof, may further be useful in diagnostic
applications, wherein the presence or amount of the nucleic acid or the protein is to be
assessed.

These materials are further useful in the generation of antibodies that bind
immunospecifically to the novel NOV9 substances for use in therapeutic or diagnostic methods. These
antibodies may be generated according to methods known in the art, using prediction from
hydrophobicity charts, as described in the "Anti-NOVX Antibodies" section below. For
example, the disclosed NOV9 protein has multiple hydrophilic regions, each of which can be
used as an immunogen. In one embodiment, a contemplated NOV9 epitope is from about
amino acid 5 to about amino acid 60. In another embodiment, a NOV9 epitope is from about
amino acids 65 to 110. In additional embodiments, NOV9 epitopes are from about amino
acids 115 to 500 and from about amino acids 520 to 680. These novel proteins can also be
used to develop assay systems for functional analysis.

NOV10

A disclosed novel NOV10 nucleic acid 2349 nucleotides long (also referred to as
95073892_EXT_REVCOMP) is shown in Table 10A (SEQ ID NO:25). An ORF begins with
an ATG initiation codon at nucleotides 1-3 and ends with a TGA codon at nucleotides 2347-
2349. The start and stop codons are in bold letters in Table 10A.
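As a quick consistency check on the ORF boundaries stated above, the following Python sketch derives the number of encoded residues from the start and stop coordinates. The function name `orf_protein_length` is illustrative and not part of the disclosure; the sketch assumes 1-based, inclusive coordinates and that the final codon is an untranslated stop codon.

```python
def orf_protein_length(start: int, end: int) -> int:
    """Return the number of amino acids encoded by an ORF.

    Assumes 1-based, inclusive nucleotide coordinates, with the last
    codon being a stop codon that is not translated.
    """
    length = end - start + 1
    if length % 3 != 0:
        raise ValueError("ORF length is not a multiple of three")
    return length // 3 - 1  # drop the stop codon


# ORF stated for SEQ ID NO:25: ATG at 1-3, TGA at 2347-2349
print(orf_protein_length(1, 2349))
```

For the stated boundaries this gives 782, which agrees with the 782 amino acid residues reported for the encoded protein (SEQ ID NO:26).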
Table 10A. NOV10 Nucleotide Sequence (SEQ ID NO:25)

ATGGTTATCATGTCGGAGTTGAGCGCGGACCCCGCGGGCCAGGGTCAGGGCCAGCAGAAGCCCCTCCGGG 

TGGGTTTTmCGACATCGAGCGGACCCTGGGCAAAGGCAACTTCGCGGTGGTGAAGCTGGCGCGGCATCG 

AGTCACCAAAACGCAGGTTGCAATAAAAATAATTGATAAAACACGATTAGATTCAAGCAATTTGGAGAAA 

ATCTATCGTGAGGTTCAGCTGATGAAGCTTCTGAACCATCCACACATCATAAAGCTTTACCAGGTTATGG 

AAACAAAGGAGATGCTTTACATCGTCACTGAATTTGCTAAAAATGGAGAAATGTATTATTTGACTTCCAA 

CGGGCACCTGAGTGAGAACGAGGCGCGGAA6AAGTTCTGGCAAATCCTGTCGGCCGTGGAGTACTGTCAC 

GACCATCACATCGTCCACCGGGACCTCAAGACCGAGAACCTCCtGCTGGATGGCT^CATGGAaVTCA^ 

TGGCAGATTTTGGATTTGGGAATTTCTACAAGTCAGGAGAGCCTCTGTCCACGTGGTGTGGGAGCCCCCC 

GTATGCCGCCCCGGAAGTCTTTGAGGGGAAGGAGTATGAAGGCCCCCAGCTGGACATCTGGGTAGGCCTG 

'ggcgtggtgctgtacgtcctggtctgcggttctctgcccttcgatgggcctaacctgccgacgctgagac 

AGCGGGTGCTGGAGQGCCGCTTCCGCATCCCCTTCXTCATGTCTCAAGACTGTGAGAGCCTGATCCGCCG 

gatgctggtggtggaccccgccaggcgcatcaccatcgcccagatccggcagcaccggxggatgcgggct 
gagccctgcttgccg6gacccgcctgccccgccttctccgcacacagctacacctccaacctgggcgact 
acgatgagcaggcgctgggtatcatgcagaccctgggcgtggaccggcagaggacggtggagtcactgca 

AAACAGCAGCTATAACCACTTTGCTGCCATTTATTACCTCCTCCTTGAGCGGCTCAAGGAGTAT^^ 

gcccagtgcgcccgccccgggcctgccaggcagccgcggcctcggagctcggacctgagtggtttggagg 
tgcctcaggaaggtctttgcaccgaccctttccgacctgccttgctgtgcgcgcagccgcagaccttggt 
gcagtccgtcgtcgaggcggagatggactgtgagctccagagctcgctgcagcccttgttcttcccggtg 
gatgccagctgcagcggagtgttccggccccggcccgtgtccccaagcagcctgctggacacagccatca 
gtgaggaggccaggcaggggccgggcctagaggaggagcaggacacgcaggagtccctgccgagcagcac 

GGGGCGGAGGCACACCCTGGCCGAGGTCTCCACCCGCCTCTCCCCACTCACCeCGCCATGTATAGTCGTC 

tccccctccaccacggcaagtcctgcagagggaaccagctctgaousttgtctgaccttctctgcgagca 

AAAGCCCCGCGGGGCTCAGTGGCACCCCGGCCACTCyySKSGGCTGCTGGGCGCCTGCTCCCCGGTCAGGC^ 

ggcctcgcccttcctggggtcgcagtccgccaccccagtgctgcaggctcaggggggcttgggaggagct 
gttctgctccctgtcagcttccaggagggacggcgggcgtcggacacctcactgactcaagggc^cgaagg 
ccttrcggcagcagctgaggaagaccacgcggaccaaagggtttctgggactgaacaaaatcaaggggct 

GGCTCGCCAGGTGTGCGAGGCCCCGGCGAGCCGG6CCAGCAGGGGCGGCCTGAGCCCCTTCCACGCCCGT 
GCACAGAGCCCAGGCCTGCACGGCGGCGCAGCCGGCAGCCGGGAGGGCTGGAGCCTGCTGGAGGAGG1?GC 
TAGAGCAGCAGAGGCTGCTCCAGTTACAGCACCACCCGGCCGCTGCACCCGGCTGCTCCCAGGCCCCCGA 
GCCGGCCCCTGCCCCGTTTGTGATCGCCCCCTGTGATGGCCCTGGGGCTGCCCCGCTCCCCAGCACCCTC 
CTCACGTCGGGGCTCCCGCTGCTGCCGCCCCCACTCCTGCAGACCGGCGCGTCCCCGGTGGCCTCA6CGG 
CGCAGCTCCl^GGACACACACCTGCACATTGGCACCGGCCCCACCGCCCTCCCCGGTGTGCCCCCACCACG 
CCTGGCCAGGCTGGCCCCAGGTTGTGAGCCCCTGGGGCTGCTGCAGGGGGACTGTGAGATGGAGGACCTG 
ATGCCCTGCTCCCTAGGCACGTTTGTCCTGGTGCAGTGA



A disclosed NOV10 protein encoded by SEQ ID NO:25 has 782 amino acid residues,
and is presented using the one-letter code in Table 10B (SEQ ID NO:26). The SignalP, Psort
and/or Hydropathy profile for NOV10 predict that NOV10 has no signal peptide and is likely
to be localized at the endoplasmic reticulum (membrane) with a certainty of 0.6000; the
microbody (peroxisome) with a certainty of 0.3000; the mitochondrial inner membrane with a
certainty of 0.1000; and the plasma membrane with a certainty of 0.1000. The disclosed
NOV10 protein is similar to the SNF1/AMPK family, some members of which show nuclear
localization. Therefore, it is likely that this novel human salt-inducible protein kinase-like
protein is available at the appropriate sub-cellular localization and hence is accessible for the
therapeutic uses described herein.

The disclosed NOV10 sequence was initially identified by searching CuraGen's
Human SeqCalling database for DNA sequences which translate into proteins with similarity
to the protein kinase protein family. SeqCalling assembly 95073892 was identified as having
suitable similarity. SeqCalling assembly 95073892 has seven components. This assembly
was analyzed further to identify open reading frame(s) encoding a novel full-length protein
by extending the SeqCalling assembly using (i) suitable additional SeqCalling assemblies, (ii)
publicly available EST sequences, as well as (iii) public genomic sequences.

Two genomic clones, GenBank Accession Numbers AP001046 and AC012140, were
identified as having regions with 100% identity to the SeqCalling assembly 95073892 and
were selected for analysis because this identity implied that these clones contained the
sequence of the genomic locus for this SeqCalling assembly.

The genomic clones were analyzed by Genscan and Grail to identify exons and
putative coding sequences/open reading frames. These clones were also analyzed by TblastN,
BlastX and other homology programs to identify regions translating to proteins with similarity
to the original protein/protein family of interest. This was found to reside in the following
genomic clone regions: in AC001046 from nucleotide 149360-149735, 150161-150392,
150878-151159, 151639-151855, 151974-152096, 152477-152623, 152852-153075, 153628-
153750, 153857-153985, 154256-154417, 154595-154655, and in AC012140 from nucleotide
50609-50725, 51225-51380.
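The coordinate list above can be checked against the stated length of the NOV10 nucleic acid. The Python sketch below sums the lengths of the listed genomic regions, assuming each range is 1-based and inclusive; the names `EXONS`, `span_length` and `total_exon_length` are illustrative, and the clone labels are copied as printed in the sentence above.

```python
# Genomic regions as listed in the text (1-based, inclusive coordinates).
EXONS = {
    "AC001046": [
        (149360, 149735), (150161, 150392), (150878, 151159),
        (151639, 151855), (151974, 152096), (152477, 152623),
        (152852, 153075), (153628, 153750), (153857, 153985),
        (154256, 154417), (154595, 154655),
    ],
    "AC012140": [(50609, 50725), (51225, 51380)],
}


def span_length(start: int, end: int) -> int:
    """Length of an inclusive coordinate range."""
    return end - start + 1


def total_exon_length(exons: dict) -> int:
    """Sum the lengths of all listed regions across all clones."""
    return sum(span_length(s, e) for spans in exons.values() for s, e in spans)


print(total_exon_length(EXONS))
```

Under these assumptions the thirteen regions total 2349 nucleotides, the same as the length given for SEQ ID NO:25, which suggests that the listed regions together account for the full open reading frame.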

The results of these analyses were integrated with SeqCalling assembly information
and manually corrected for apparent inconsistencies, thereby obtaining the sequences encoding
the full-length cDNA and protein. When necessary, the process to identify and analyze
cDNAs/ESTs and genomic clones was reiterated to derive the full-length sequence. This
invention describes this full-length DNA sequence(s) and their splice forms and the full-length
protein sequence(s) that they encode. These nucleic acid and protein sequences for each
splice form are referred to here as NOV10.



Table 10B. Encoded NOV10 protein sequence (SEQ ID NO:26).

MVlMSEFSADPAGQGQGQQKPLRVGFYDIERTLGKGNFAWKIiARHRVTKTQVAlKIIDKTRLDSSNL 

EKIYREVQliMKLLNHPHIIKLYQVMBTKI»4LYIVTEFAKNGEMYYLTSNGHLSENEARKKFWQI 

EYCHDHHIVHRDLKTENLLLDGNMDIKLADFGFGNFYKSGEPLSTWCGSPPYAAPEVFEQKSYEGPQL 

DlWVGLGVVLYVLVCGSLPFDGPNLPTLRQRVLEGRFRIPFFMSQDCESLIRRMtjVVDPARRlTIAQI 

RQHRWMRAEPCLPGPACPAFSAHSYTSNLGDYDEQALGIMQTLGVDRQRTVESLQNSSYNHFAAIYYL 

LLERLKEYRNAQCARPGPARQPRPRSSDLSGLEVPQEGLSTDPFRPAIiLCPQPQTLVQSVLQAEMDCE 

LQSSLQPLFFPVDASCSGVFRPRPVSPSSLLDTAISEEARQGPGLEEEQDTQESLPSSTGRRHTLAEV 

STRLSPLTAPCIVVSPSTTASPAEGTSSDSCLTFSASKSPAGLSGTPATQGLLGACSPVRLASPFLGS 

QSATPVLQAQGGLGGAVLLPVSFQEGRRASDTSLTQGLKAFRQQLRKTTRTKGFIiGLNKIKGLARQVC 

QAPASRASRGGLSPFHAPAQSPGLHGGAAGSREGWSLLEEVLEQQRLLQLQHHPAAAPGCSQAPQPAP 

APFVIAPCDGPGAAPLPSTLLTSGLPLLPPPLLQTGASPVASAAQLLDTHLHIGTGPTALPAVPPPRL 

ARLAPGCEPLGLLQGDCEMEDIiMPCSLGTFVLVQ 
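The relationship between the nucleotide sequence of Table 10A and the protein of Table 10B can be illustrated on a short stretch. The Python sketch below translates the first 69 bases of the fourth printed line of Table 10A with the standard genetic code. Assuming 70 nucleotides per printed line, this stretch corresponds to nucleotides 211-279 of SEQ ID NO:25 and therefore starts in frame. The helper `translate` and the constant names are illustrative and are not taken from the source.

```python
from itertools import product

# Standard genetic code, codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def translate(dna: str) -> str:
    """Translate an in-frame DNA string; trailing partial codons are ignored."""
    dna = dna.upper()
    usable = len(dna) - len(dna) % 3
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, usable, 3))


# First 69 bases of the fourth printed line of Table 10A.
FRAGMENT = (
    "ATCTATCGTGAGGTTCAGCTGATGAAGCTTCTGAACCATCCACACATCAT"
    "AAAGCTTTACCAGGTTATG"
)

print(translate(FRAGMENT))
```

The translation, IYREVQLMKLLNHPHIIKLYQVM, matches the corresponding stretch near the start of the second line of Table 10B (following the residues EK).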

PCR-coupled cDNA subtraction hybridization was adapted to identify the genes
expressed in the adrenocortical tissues from high salt diet-treated rat. A novel cDNA clone,
termed salt-inducible kinase (SIK), encoding a polypeptide (776 amino acids) with significant
similarity to protein serine/threonine kinases in the SNF1/AMPK family was isolated. An in
vitro kinase assay demonstrated that SIK protein had autophosphorylation activity. Northern
blot revealed that SIK mRNA levels were markedly augmented by ACTH treatment both in rat
adrenal glands and in Y1 cells. Thus, SIK may play an important role in the regulation of
adrenocortical functions in response to high plasma salt and ACTH stimulation. See Wang et
al., FEBS Lett. 453:135-39 (1999).

The gene encoding the novel human salt-inducible protein kinase-like protein of this
invention maps to chromosome 21 between markers MX1-D21S171.

The human salt-inducible protein kinase-like protein disclosed in this invention was
found to be expressed in the endocrine system (for example, adrenal gland/suprarenal gland),
and in the urinary system (for example, kidney). In addition, the rat and mouse homologs of
this gene are expressed in the nervous system (for example, brain) and in the cardiovascular
system (for example, heart). Therefore, it is likely that the gene encoding the novel human
salt-inducible protein kinase-like protein of this invention (i.e., the gene encoding the NOV10
polypeptide) is also expressed in these tissues in humans.

Patp results for NOV10 include those listed in Table 10C.



Table 10C. Patp alignments of NOV10

Sequences producing High-scoring Segment Pairs:              Reading Frame   High Score   Smallest Sum Prob.
Patp:W90878 Human keratinocyte derived pKe#122 protein #1    +1              776          0.0
Patp:W90879 Human keratinocyte derived pKe#122 protein #2    +1              776          0.0
Patp:B36283 Human protein fragment PN765 - Homo sapiens      +1              209          2.7e-108



For example, a BLAST against W90878, a 790 amino acid regulatory polypeptide from
Homo sapiens, produced 776/783 (99%) identity, and 777/783 (99%) positives (E = 0.0), with long
segments of amino acid identity, as shown in Table 10D. See WO 00/17232-A1.



Table 10D. BLAST results of NOV10 and W90878 (SEQ ID NO:90)

Score = 4022 (1415.8 bits), Expect = 0.0, P = 0.0
Identities = 776/783 (99%), Positives = 777/783 (99%), Frame = +1




NOVIO: 


1 ^f^IMSEFSADPAGQGQGQQKPLRVG^YDIERTLGKGNFAVVKIiARHRVTKTQVi^ 


60 




ititiiiiitiiii iiiiiiitiiinMiitiitiiiiiiiiiiiiiiii liiiiii 




W90878: 


8 MVIMSEFSADPAGQSQGQQKPLRVGFYDIERTLGKGNFAWKLARHRVTKTQNAIKIIDK 


67 


NOVlO: 


61 TRLDSSNLEKIYREVQIJ^LLNHPHIIKLYQVMETKDMLYrVTEFAKNGEMY-YL^ 


119 




itMiiiiitiiiiiiiiiiiiiriiiiiniiiiiititiiiiiiiiiii'i- MM Ml 




W90878; 


68 TRLDSSNLEKIYREVQLMKLLNHPHIIECLYQVMETKDMLYIVTEFAKNGEMFDYLTSNGH 


127 


NOVIO: 


120 LSENEARKKFWQILSAVEYCEDHHIVHRDI4C1-B;NLLLDGNMDIKIADFGFGN 17 9 
W90878:


128 


1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 ) 1 1 11 1 1 1 1 1 M 1 1 i 1 1 

LSENEAHKKFWQILSAVEYCHDHHIVHRDLKTENLUjDGKMDIKLADFGFGNFYKSGEPL 


187 


NOVIO; 


180 


STWCGSPPYAAPEVFEGKEYEGPQLDIWVGLGWLYVLVCGSLPFDGPNLPTLRQRVLEG 

IIIMNIIIIltlllltiniMIII! Itllllllllllltlllllllllillllll 

^*V^Ca^PP Y A A PE VFE GKE YE GPOLDTM - S LG WL YVLVCG S L P FD G PNL PT LRQR VLEG 


238 


UQnP'7 R • 
W jUO / 0 i 




246 


NOVlO: 
W9Q878: 


239 
247 


RFRIPFFMSQDCESLIRRMLWDPARRITIAQIRQHRWMRAEPCLPGPACPAFSAHSYTS 

lllllllllllillilllllllltltlllMIIIIMIIMtllllllllllllllllil 

RFRIPFFM3QDCESLIRRMLVVDPARRITIA0IR0MRWMRAEPCLPGPACPAFSAHSYTS 


298 
306 


NOVIO ■ 


299 


NLGDYDEQALGIMQTLGVDRQRTVESLQNSSYNHFAAIYYLLIiERLKEYRNAQCARPGPA 

iMiiiiMiiiiiiiiiiniiiiiiiiiiiiiiiiiiiiiitiiiiiiiiiiiiiiii 

NLGDYDEQALGIMQTLGVDRQRTVESLQNSSYNHFAAtYYLLliERIiKEYRNAQCARPGPA 


358 


W90878! 


307 


366 


NOVlO: 


359 


RQPRPRSSDLSGLEVPQHIGLSTDPFRPALI^CPQPQTLVQSVLOAEMDCELQSSLQ-PLFF 

tllllllllltllDltllllllltlMIIIIIIIIIMIilllllMttMM) nil 

RQPRPRSSDLSGLEVPQEGLSTDPFRPALLCP0PQTLVQSVLQAEMDCELQS3LQWPLFF 


All 


W90878: 


367 


426 


NOVlO: 
W9087B: 


418 
427 


PVDASCSGVFRPRPVSPSSLLDTAISEEARQGPGLEEEQDTQESLPSSTGRRHTr*AEVST 

iitiiiiiiiitiiiiitiiiiiMiiiiiiiiiiiiiiiitniiiiiiiitiiinii 

PVDAS C SGVFRPRPV3 P S SLLDT AI 3EEARQ6PGLEEE QDTQE S LPS STGRRHTLASV3T 


477 
486 


NOVlO: 


478 


RLSPLTAPCIWSPSTTASPAEGTSSDSCLTFSASKSPAGLSpXPATQGLLGACSPVRLA 

1 1 III! lit III 11)1 11 iiiiiiiiiiii lit liiiiitiniiitiii nil mill 

RLSPLTAPClWSPSTTASPAEGTSSDSCLTFSASKSPAGLSGTPATQGLLGACSPVRIiA 


537 


W9b87B: 


487 


546 


NOVlO: 
W90878: 


538 
547 


SPFLGSQSATPVLQAQGGLGGAVLLPVSFQEGRRASDTSLTQGLKAniQQLRKTTRTKGF 

IMIIIllllllltllMlllilllllllllllllinillllllllllllllllllill 

SPFLGSQSATPVXiQAQGGLGGAVLLPVSFQEGRRASDTSLTQGLKAFRQQtiRKTTRTKGF 


597 
606 


NOVlO; 
W90B78: 


598 
607 


liQLNKIKGLARQVCQAPASRASRGGl^a^FHAPAQSPGriHGGAAGSREGWSIiLEEVLEQQR 

1) 1 1 1 II 1 1 II 1 ) 1 1 1 1 II t n 1 1 1 1 II 1 1 II II n II 1 1 1 1 1 1 1 1 1 1 II II 1 1 ) 1 1 1 1 

LGLNKIKGLARQVCQVPASRASRGGLSPFHAPAQSPGLHGGAAGSREGWSLLEEVLEQQR 


657 
666 


NOVlO: 
W90878: 


658 
667 


LLQLQHHPAAAPGCSQAPQPAPAPFVIAPCDGPGAAPLPSTLrjTSGLPLLPPPLLQTGAS 

IMIIIIItllllllllllMlllllllltllMlllllllltllMlitlllllltlll 

LLQLQHHPAAAPGCSQAPQPAPAPFV1APCDGPGAAPLPSTLLT3GLPLLPPPLLQTGAS 


717 
725 


NOVlO: 


718 


PVASAAQLLDTHLHIGTGPfAIiPAVPPPRLARLAPGCEPIiGLLQGDCEMEDLMPCSLGTF 

) 1 1 II II 1 1 1 1 11 ] 1 1 1 1 1 II 1 1 II 1 1 1 11 i II II 1 1 Ml 1 1 1 1 1 1 1 1 1 1 1) 1 1 1 1 1 1 1 1 

PVASAAQLLDTHLHIGTGPTALPAVPPPRIARLAPGCEPLGLLQGDCEMEDLMPC3LGTF 


111 


W90878: 


727 


786 


NOVlO: 


778 


VLVQ 781 

Mil 




W90878: 


787 


VLVQ 790 





Additionally, NOV10 also showed a large degree of homology with W90879, an 823
amino acid regulatory polypeptide from Homo sapiens. Specifically, a BLAST produced
776/783 (99%) identity, and 777/783 (99%) positives (E = 0.0), with long segments of amino
acid identity. See WO 00/17232-A2.

A BLAST against B36283, a 213 amino acid human protein fragment from Homo
sapiens, produced 209/212 (98%) identity and 209/212 (98%) positives (E = 2.7e-108), with
long segments of amino acid identity. See WO 00/65340-A1.
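The percentages quoted with the identity and positive counts above can be reproduced from the raw counts. The figures appear consistent with integer truncation rather than rounding (209/212 is about 98.6% and is reported as 98%); this is an inference from the numbers and is not stated in the source. The function name `blast_percent` in the Python sketch below is illustrative.

```python
def blast_percent(matches: int, aligned_length: int) -> int:
    """Percentage of matching positions, truncated to a whole number."""
    if aligned_length <= 0:
        raise ValueError("aligned length must be positive")
    return (100 * matches) // aligned_length


# Counts quoted in the text for the Patp hits against NOV10.
hits = [
    ("W90878 identities", 776, 783),
    ("W90878 positives", 777, 783),
    ("B36283 identities", 209, 212),
]

for label, matches, length in hits:
    print(f"{label}: {matches}/{length} ({blast_percent(matches, length)}%)")
```

With truncation the three cases give 99%, 99% and 98%, as quoted; ordinary rounding would instead give 99% for 209/212.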

The disclosed NOV10 protein (SEQ ID NO:26) has good identity with a number of
kinase proteins. The identity information used for ClustalW analysis is presented in Table
10E.
Table 10E. BLAST results for NOV10

Gene Index/Identifier    Protein/Organism    Length (aa)    Identity (%)    Positives (%)    Expect    Gaps

gi|9978891|sp|P57059| (AP001751)    SN1L_HUMAN PROBABLE SERINE/THREONINE KINASE SNF1LK, Homo sapiens    786    670/787 (85%)    671/787 (85%)    0.0    6/787 (0%)

gi|12643489|sp|Q9R1U5| (AB020480)    SN1L_RAT PROBABLE SERINE/THREONINE KINASE SNF1LK (SALT-INDUCIBLE PROTEIN KINASE) (PROTEIN KINASE KID2), Rattus norvegicus    776    561/787 (71%)    591/787 (74%)    0.0    16/787 (2%)

gi|11067425|ref|NP_067725.1| (AF106937)    Salt-inducible protein kinase, Rattus norvegicus    776    560/787 (71%)    591/787 (74%)    0.0    16/787 (2%)

gi|6754746|ref|NP_034961.1| 7011494)    Myocardial SNF1-like kinase, Mus musculus    779    554/790 (70%)    588/790 (74%)    0.0    19/790

gi|6760436|gb|AAF28351.1|AF219232_1 (AF219232)    Qin-induced kinase, Gallus gallus    798    472/803 (58%)    540/803 (66%)    0.0    26/803 (3%)


This information is presented graphically in the multiple sequence alignment given in
Table 10F (with NOV10 being shown on line 1) as a ClustalW analysis comparing NOV10
with related protein sequences.



Table 10F. Information for the ClustalW proteins:

1) NOV10 (SEQ ID NO:26)

2) gi|9978891|sp|P57059|SN1L_HUMAN PROBABLE SERINE/THREONINE PROTEIN KINASE SNF1LK (SEQ ID NO:91)

3) gi|12643489|sp|Q9R1U5|SN1L_RAT PROBABLE SERINE/THREONINE PROTEIN KINASE SNF1LK (SEQ ID NO:92)

4) gi|11067425|ref|NP_067725.1| salt-inducible protein kinase (Rattus norvegicus) (SEQ ID NO:93)

5) gi|6754746|ref|NP_034961.1| myocardial SNF1-like kinase (Mus musculus) (SEQ ID NO:94)

6) gi|6760436|gb|AAF28351.1|AF219232_1 (AF219232) (Gallus gallus) (SEQ ID NO:95)



[ClustalW multiple sequence alignment of NOV10 (line 1) with gi|9978891, gi|12643489, gi|11067425, gi|6754746 and gi|6760436; the aligned sequence blocks are not legible in this copy.]
The presence of identifiable domains in NOV10 was determined by searches using
algorithms such as PROSITE, Blocks, Pfam, ProDomain, Prints and then determining the
Interpro number by crossing the domain match (or numbers) using the Interpro website
(http:www.ebi.ac.uk/interpro/).

DOMAIN results for NOV10 were collected from the Conserved Domain Database
(CDD) with Reverse Position Specific BLAST. This BLAST samples domains found in the
Smart and Pfam collections. The results are listed in Table 10G with the statistics and domain
description. The results indicate that this protein contains the following protein domains (as
defined by Interpro) at the indicated positions: serine/threonine protein kinases, catalytic
domain (at amino acid positions 27-278); pkinase, eukaryotic protein kinase domain (at amino
acid positions 11 "21^); tyrosine kinase, catalytic domain (at amino acid positions 29-274);
RIO-like kinase (at amino acid positions 32-167). This indicates that the sequence of NOV10
has properties similar to those of other proteins known to contain these domains and similar to
the properties of these domains.
Table 10G. DOMAIN results for NOV10

Domain               Name                                                 Score (bits)   E Value
gnl|Smart|S_TKc      Serine/Threonine protein kinases, catalytic          279            3e-76
                     domain; Phosphotransferases. Serine or
                     threonine-specific kinase
gnl|Pfam|pfam00069   Pkinase, Eukaryotic protein kinase domain                           6e-67
gnl|Smart|TyrKc      Tyrosine kinase, catalytic domain;                   144            2e-35
                     Phosphotransferases. Tyrosine-specific kinase
                     subfamily.
gnl|Smart|RIO        RIO-like kinase                                      36.6           0.005



For example, the results of a BLAST of NOV10 against gnl|Smart|S_TKc (SEQ ID
NO:96) are shown in Table 10H.

Table 10H. BLAST of NOV10 against gnl|Smart|S_TKc

CD-Length = 256 residues, 100.0% aligned
Score = 279 bits (714), Expect = 3e-76

[Pairwise alignment of NOV10 with gnl|Smart|S_TKc; the aligned sequence lines are not legible in this copy.]

The similarity information for the NOV10 protein and nucleic acid disclosed herein
suggests that NOV10 may have important structural and/or physiological functions
characteristic of the protein kinase family and the NOV10 family. The expression pattern,
map location, and protein similarity information for the invention suggest that the human salt-
inducible protein kinase-like protein described in this invention may function as a protein
kinase.

NOV10 has been analyzed for tissue expression profiles using the methods described
in the Examples. Various collections of samples are assembled on the plates, and referred
to as Panel 1 (containing cells and cell lines from normal and cancer sources), Panel 2
(containing samples derived from tissues, in particular from surgical samples, from normal
and cancer sources), Panel 3 (containing samples derived from a wide variety of cancer
sources) and Panel 4 (containing cells and cell lines from normal cells and cells related to
inflammatory conditions). TaqMan oligo set Ag1542 for the NOV10 gene includes the forward,
probe and reverse oligomers shown in Table 10I.



Table 10I. TaqMan oligo set Ag1542

Primers    Sequences                                      SEQ ID NO:
Forward    5'-CTATCGTGAGGTTCAGCTGATG-3'                   97
Probe      FAM-5'-AAGCTTCTGAACCATCCACACATCAT-3'-TAMRA     98
Reverse    5'-CCTTTGTTTCCATAACCTGGTA-3'                   99
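The placement of the Ag1542 oligomers on the NOV10 transcript can be checked directly against the sequence of Table 10A. The Python sketch below searches a 79-base excerpt of SEQ ID NO:25 (the fourth printed line of Table 10A plus the first nine bases of the fifth, which would be nucleotides 211-289 if each printed line holds 70 bases) for the forward primer, the probe, and the reverse complement of the reverse primer. The function names and the choice of excerpt are illustrative assumptions, not part of the disclosure.

```python
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA string."""
    return seq.upper().translate(_COMPLEMENT)[::-1]


# Excerpt of SEQ ID NO:25 as printed in Table 10A (sense strand).
TEMPLATE = (
    "ATCTATCGTGAGGTTCAGCTGATGAAGCTTCTGAACCATCCACACATCAT"
    "AAAGCTTTACCAGGTTATGGAAACAAAGG"
)
FORWARD = "CTATCGTGAGGTTCAGCTGATG"      # SEQ ID NO:97
PROBE = "AAGCTTCTGAACCATCCACACATCAT"    # SEQ ID NO:98
REVERSE = "CCTTTGTTTCCATAACCTGGTA"      # SEQ ID NO:99


def locate_amplicon(template: str, forward: str, probe: str, reverse: str):
    """Return 0-based start offsets of the three oligos and the amplicon."""
    reverse_site = reverse_complement(reverse)
    fwd_start = template.find(forward)
    probe_start = template.find(probe)
    rev_start = template.find(reverse_site)
    if min(fwd_start, probe_start, rev_start) < 0:
        raise ValueError("an oligo does not match the template")
    amplicon = template[fwd_start:rev_start + len(reverse_site)]
    return fwd_start, probe_start, rev_start, amplicon


fwd, probe, rev, amplicon = locate_amplicon(TEMPLATE, FORWARD, PROBE, REVERSE)
print(fwd, probe, rev, len(amplicon))
```

On this excerpt all three oligomers match exactly, the probe lies between the two primers, and the primers bound an amplicon of 77 bases.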



TaqMan oligo set Ag2369 for the NOV10 gene includes the forward, probe and reverse
oligomers shown in Table 10J.



Table 10J. TaqMan oligo set Ag2369

Primers    Sequences                                        SEQ ID NO:
Forward    5'-TCAGCTGATGAAGCTTCTGAAC-3'                     100
Probe      FAM-5'-CATCCACACATCATAAAGCTTTACCAGG-3'-TAMRA     101
Reverse    5'-CGATGTAAAGCATGTCCTTTGT-3'                     102



The results of the TaqMan expression profile of transcript with these probes are shown
below in Tables 10K-10N. Specifically, for Panel 1.3, the expression of Ag1542 is in normal
adipose, ovary, lung, and trachea. It is also highly expressed in one renal tumor. For Panel 2,
most normal tissue and tumor margins do not express appreciable levels of this transcript.
The highest levels are in the TCC 3. For Panel 4D, small airway epithelium expresses very
low levels of this transcript unless it is activated with TNF alpha/IL-1, which increases
expression greater than four-fold. Lymphokine activated killer cells (LAK cells) also
upregulate this transcript greater than twelve-fold when treated with PMA and ionomycin.

This transcript is up-regulated in small airway epithelium stimulated with
proinflammatory cytokines and in activated LAK cells, suggesting that it may be involved in
the inflammatory process in these two tissues. Blocking the action of this molecule with
antibody or small molecule therapeutics may reduce or eliminate inflammation in diseases
which target the small airway epithelium such as allergy/asthma and viral infections.
Reducing the activity of this molecule in LAK cells during transplantation may prevent organ
rejection.

Table 10K. TaqMan Results, Probe Ag1542 (Panel 1.3)

Tissue Name    % Relative Expression
Liver adenocarcinoma    27.9
Heart (fetal)    38.7
Pancreas    2.6
Pancreatic ca. CAPAN 2    2.6
Adrenal gland    16.5
Thyroid    4.6
Salivary gland    1.9
Pituitary gland    9.7
Brain (fetal)    2.9
Brain (whole)    2.1
Brain (amygdala)    3.6
Brain (cerebellum)    1.0
Brain (hippocampus)    20.2
Brain (thalamus)    3.3
Cerebral Cortex    10.7
Spinal cord    6.3
CNS ca. (glio/astro) U87-MG    6.2
CNS ca. (glio/astro) U-118-MG    7.1
CNS ca. (astro) SW1783    10.6
CNS ca.* (neuro; met) SK-N-AS    6.0
CNS ca. (astro) SF-539    3.9
CNS ca. (astro) SNB-75    ILl
CNS ca. (glio) SNB-19    0.4
CNS ca. (glio) U251    2.4
CNS ca. (glio) SF-295    2.9
Heart    6.1
Skeletal muscle    3.4
Bone marrow    3.7
Thymus    2.1
Spleen    16.8
Lymph node    6.3
Colorectal    13.8
Stomach    4.8
Small intestine    2.4
Colon ca. SW480    4.8
Colon ca.* (SW480 met) SW620    5.9
Colon ca. HT29    3.2
Colon ca. HCT-116    3.9
Colon ca. CaCo-2    8.8
83219 CC Well to Mod Diff (OD03866)    20.0
Colon ca. HCC-2998    42.3
Gastric ca.* (liver met) NCI-N87    37.6
Bladder    3.2
Trachea    40.6
Kidney    1.4
Kidney (fetal)    14.4
Renal ca. 786-0    54
Renal ca. A498    100.0
Renal ca. RXF 393    13.9
Renal ca. ACHN    13.8
Renal ca. UO-31    8.4
Renal ca. TK-10    8.0
Liver    2.0
[illegible]    17.0
[illegible]    17.8
Lung    43.5
[illegible]    16.2
Lung ca. (small cell) LX-1    2.8
Lung ca. (small cell) NCI-H69    10.3
Lung ca. (s.cell var.) SHP-77    20.6
Lung ca. (large cell) NCI-H460    25.5
Lung ca. (non-sm. cell) A549    63.3
[illegible]    25.5
[illegible]    2.0
Lung ca. (non-s.cl) NCI-H522    0.5
Lung ca. (squam.) SW 900    8.5
Lung ca. (squam.) NCI-H596    3.9
Mammary gland    25.0
Breast ca.* (pl. effusion) MCF-7    13.2
Breast ca.* (pl.ef) MDA-MB-231    48.6
[illegible]    0.9
Breast ca. BT-549    15.4
Breast ca. MDA-N    0.8
Ovary    57.0
Ovarian ca. OVCAR-3    8.4
Ovarian ca. OVCAR-4    1.9
Ovarian ca. OVCAR-5    5.3
Ovarian ca. OVCAR-8    7.9
[illegible]    1.4
Ovarian ca.* (ascites) SK-OV-3    10.8
[illegible]    3.5
[illegible]    15.8
[illegible]    4.9
[illegible]    6.1
[illegible]    10.7
Melanoma Hs688(A).T    0.4
Melanoma* (met) Hs688(B).T    1.0
Melanoma UACC-62    0.4
Melanoma M14    0.6
Melanoma LOX IMVI    2.9
Melanoma* (met) SK-MEL-5    2.7
Adipose    55.1



Table 10L. TaqMan Results, Probe Ag1542 (Panel 2D)

Tissue Name  % Relative Expression
Normal Colon GENPAK 061003  17.8
83219 CC Well to Mod Diff (OD03866)  8.0
83220 CC NAT (OD03866)  21.5
83221 CC Gr.2 rectosigmoid (OD03868)  2.2
83222 CC NAT (OD03868)  0.4
83235 CC Mod Diff (ODO3920)  3.3
83236 CC NAT (ODO3920)  1
83237 CC Gr.2 ascend colon (OD03921)  40.1
83238 CC NAT (OD03921)  13.9
83241 CC from Partial Hepatectomy (ODO4309)  16.4
83242 Liver NAT (ODO4309)  31.2
87472 Colon mets to lung (OD04451-01)  6.7
87473 Lung NAT (OD04451-02)  10.8
Normal Prostate Clontech A+ 6546-1  4.1
84140 Prostate Cancer (OD04410)  7.6
84141 Prostate NAT (OD04410)  6.4
87073 Prostate Cancer (OD04720-01)  23.5
87074 Prostate NAT (OD04720-02)  50.4
Normal Lung GENPAK 061010  34.2
83239 Lung Met to Muscle (OD04286)  16.8
83240 Muscle NAT (OD04286)  16.6
84136 Lung Malignant Cancer (OD03126)  25.5
84137 Lung NAT (OD03126)  58.2
84871 Lung Cancer (OD04404)  27.4
84872 Lung NAT (OD04404)  16.6
84875 Lung Cancer (OD04565)  18.4
85950 Lung Cancer (OD04237-01)  18.4
85970 Lung NAT (OD04237-02)  31.9
83255 Ocular Mel Met to Liver (OD04310)  8.5
83256 Liver NAT (OD04310)  37.1
84139 Melanoma Mets to Lung (OD04321)  5.5
84138 Lung NAT (OD04321)  33.5
Normal Kidney GENPAK 061008  4 J
83786 Kidney Ca, Nuclear grade 2 (OD04338)  9.6
83787 Kidney NAT (OD04338)  21.8
83788 Kidney Ca Nuclear grade 1/2 (OD04339)  6.0
83789 Kidney NAT (OD04339)  17.9
83790 Kidney Ca, Clear cell type (OD04340)  18.2
83791 Kidney NAT (OD04340)  29.1
83792 Kidney Ca, Nuclear grade 3 (OD04348)  11.0
83793 Kidney NAT (OD04348)  8.4
87474 Kidney Cancer (OD04622-01)  19.1
87475 Kidney NAT (OD04622-03)  7.8
85973 Kidney Cancer (OD04450-01)  4.5
85974 Kidney NAT (OD04450-03)  11.5
Kidney Cancer Clontech 8120607  2.5
Kidney NAT Clontech 8120608  5.9
Kidney Cancer Clontech 8120613  4.8
Kidney NAT Clontech 8120614  5.9
Kidney Cancer Clontech 9010320  18.8
Kidney NAT Clontech 9010321  11.3
Normal Uterus GENPAK 061018  2.4
Uterus Cancer GENPAK 064011  27.2
Normal Thyroid Clontech A+ 6570-1  4.1
Thyroid Cancer GENPAK 064010  9.6
Thyroid Cancer INVITROGEN A302152  7.3
Thyroid NAT INVITROGEN A302153  4.6
Normal Breast GENPAK 061019  22.4
84877 Breast Cancer (OD04566)  17.3
85975 Breast Cancer (OD04590-01)  13.4
85976 Breast Cancer Mets (OD04590-03)  13.4
87070 Breast Cancer Metastasis (OD04655-05)  4.2
GENPAK Breast Cancer 064006  4.6
Breast Cancer Clontech [illegible]  7.4
[illegible]  7.3
Breast Cancer INVITROGEN [illegible]  4.0
Breast NAT INVITROGEN [illegible]  3.0
Normal Liver GENPAK [illegible]  0.3
Liver Cancer GENPAK [illegible]  5.6
Liver Cancer Research Genetics RNA [illegible]  36.6
Liver Cancer Research Genetics RNA [illegible]  10.3
Paired Liver Cancer Tissue Research Genetics RNA [illegible]  52.1
Paired Liver Tissue Research Genetics RNA [illegible]  19.2
Paired Liver Cancer Tissue Research Genetics RNA [illegible]  8.4
Paired Liver Tissue Research Genetics RNA [illegible]  12.9
Normal Bladder GENPAK [illegible]  13.5
Bladder Cancer [illegible]  5.0
Bladder Cancer INVITROGEN [illegible]  5.6
[illegible] Bladder Cancer ([illegible])  100.0
[illegible] Bladder Normal Adjacent ([illegible])  23.2
Normal Ovary Res. Gen.  21.9
Ovarian Cancer GENPAK [illegible]  17.9
[illegible] Ovary Cancer ([illegible])  5.3
87493 Ovary NAT (OD04768-05)  18.2
Normal Stomach GENPAK [illegible]  31.2
NAT Stomach Clontech [illegible]  21.6
Gastric Cancer Clontech [illegible]  14.5
NAT Stomach Clontech 9060394  41.2
Gastric Cancer Clontech 9060397  11.3
NAT Stomach Clontech 9060396  6.4
Gastric Cancer GENPAK 064005  20.3



Table 10M. TaqMan Results, Probe Ag1542 (Panel 4D)

Tissue Name  % Relative Expression
93768_Secondary Th1_anti-CD28/anti-CD3  0.6
93769_Secondary Th2_anti-CD28/anti-CD3  1.0
93770_Secondary Tr1_anti-CD28/anti-CD3  0.8
93573_Secondary Th1_resting day 4-6 in IL-2  0.0
93572_Secondary Th2_resting day 4-6 in IL-2  0.1
93571_Secondary Tr1_resting day 4-6 in IL-2  0.1
93568_primary Th1_anti-CD28/anti-CD3  2.2
93569_primary Th2_anti-CD28/anti-CD3  1.5
93570_primary Tr1_anti-CD28/anti-CD3  3.0
93565_primary Th1_resting dy 4-6 in IL-2  1.3
93566_primary Th2_resting dy 4-6 in IL-2  0.5
93567_primary Tr1_resting dy 4-6 in IL-2  1.4
93351_CD45RA CD4 lymphocyte_anti-CD28/anti-CD3  1.3
93352_CD45RO CD4 lymphocyte_anti-CD28/anti-CD3  0.9
93251_CD8 Lymphocytes_anti-CD28/anti-CD3  0.7
93353_chronic CD8 Lymphocytes 2ry_resting dy 4-6 in IL-2  0.7
93574_chronic CD8 Lymphocytes 2ry_activated CD3/CD28  0.4
93354_CD4_none  1.5
93252_Secondary Th1/Th2/Tr1_anti-CD95 CH11  0.2
93103_LAK cells_resting  1.3
93788_LAK cells_IL-2  0.4
93787_LAK cells_IL-2+IL-12  1.5
93789_LAK cells_IL-2+IFN gamma  2.2
93790_LAK cells_IL-2+ IL-18  1.9
93104_LAK cells_PMA/ionomycin and IL-18  12.8
93578_NK Cells IL-2_resting  0.4
93109_Mixed Lymphocyte Reaction_Two Way MLR  1.3
93110_Mixed Lymphocyte Reaction_Two Way MLR  0.8
93111_Mixed Lymphocyte Reaction_Two Way MLR  0.2
93112_Mononuclear Cells (PBMCs)_resting  3.0
93113_Mononuclear Cells (PBMCs)_PWM  4.4
93114_Mononuclear Cells (PBMCs)_PHA-L  1.1
93249_Ramos (B cell)_none  1.0
93250_Ramos (B cell)_ionomycin  2.0
93349_B lymphocytes_PWM  5.9
93350_B lymphocytes_CD40L and IL-4  3.0
92665_EOL-1 (Eosinophil)_dbcAMP differentiated  1.0
93248_EOL-1 (Eosinophil)_dbcAMP/PMA/ionomycin  1.9
93356_Dendritic Cells_none  0.3
93355_Dendritic Cells_LPS 100 ng/ml  0.2
93775_Dendritic Cells_anti-CD40  0.1
93774_Monocytes_resting  0.6
93776_Monocytes_LPS 50 ng/ml  0.5
93581_Macrophages_resting  1.1
93582_Macrophages_LPS 100 ng/ml  0.6
93098_HUVEC (Endothelial)_none  0.8
93099_HUVEC (Endothelial)_starved  1.0
93100_HUVEC (Endothelial)_IL-1b  0.7
93779_HUVEC (Endothelial)_IFN gamma  0.3
93102_HUVEC (Endothelial)_TNF alpha + IFN gamma  1.3
93101_HUVEC (Endothelial)_TNF alpha + IL4  0.9
93781_HUVEC (Endothelial)_IL-11  0.3
93583_Lung Microvascular Endothelial Cells_none  1.1
93584_Lung Microvascular Endothelial Cells_TNFa (4 ng/ml) and IL1b (1 ng/ml)  3.2
92662_Microvascular Dermal endothelium_none  2.1
92663_Microvascular Dermal endothelium_TNFa (4 ng/ml) and IL1b (1 ng/ml)  2.6
93773_Bronchial epithelium_TNFa (4 ng/ml) and IL1b (1 ng/ml) **  18.4
93347_Small Airway Epithelium_none  5.1
93348_Small Airway Epithelium_TNFa (4 ng/ml) and IL1b (1 ng/ml)  2.9
92668_Coronary Artery SMC_resting  1.1
92669_Coronary Artery SMC_TNFa (4 ng/ml) and IL1b (1 ng/ml)  0.5
93107_astrocytes_resting  1.5
93108_astrocytes_TNFa (4 ng/ml) and IL1b (1 ng/ml)  1.1
92666_KU-812 (Basophil)_resting  0.7
92667_KU-812 (Basophil)_PMA/ionomycin  0.9
93579_CCD1106 (Keratinocytes)_none  10.6
93580_CCD1106 (Keratinocytes)_TNFa and IFNg **  5.2
93791_Liver Cirrhosis  4.2
93792_Lupus Kidney  1.1
93577_NCI-H292  100.0
93358_NCI-H292_IL-4  90.1
93360_NCI-H292_IL-9  100.0
93359_NCI-H292_IL-13  52.5
93357_NCI-H292_IFN gamma  67.4
93777_HPAEC_-  0.7
93778_HPAEC_IL-1 beta/TNF alpha  2.8
93254_Normal Human Lung Fibroblast_none  0.1
93253_Normal Human Lung Fibroblast_TNFa (4 ng/ml) and IL-1b (1 ng/ml)  0.4
93257_Normal Human Lung Fibroblast_IL-4  0.4
93256_Normal Human Lung Fibroblast_IL-9  0.2
93255_Normal Human Lung Fibroblast_IL-13  0.3
93258_Normal Human Lung Fibroblast_IFN gamma  0.7
93106_Dermal Fibroblasts CCD1070_resting  0.5
93361_Dermal Fibroblasts CCD1070_TNF alpha 4 ng/ml  0.7
93105_Dermal Fibroblasts CCD1070_IL-1 beta 1 ng/ml  0.4
93772_dermal fibroblast_IFN gamma  0.1
93771_dermal fibroblast_IL-4  0.1
93259_IBD Colitis 1  1.5
93260_IBD Colitis 2  0.7
93261_IBD Crohns  1.4
735010_Colon_normal  2.9
735019_Lung_none  6.8
64028-1_Thymus_none  2.7
64030-1_Kidney_none  9.6



Table 10N. TaqMan Results, Probe Ag2369 (Panel 4D)

Tissue Name  % Relative Expression
93768_Secondary Th1_anti-CD28/anti-CD3  0.3
93769_Secondary Th2_anti-CD28/anti-CD3  0.6
93770_Secondary Tr1_anti-CD28/anti-CD3  0.4
93573_Secondary Th1_resting day 4-6 in IL-2  0.0
93572_Secondary Th2_resting day 4-6 in IL-2  0.1
93571_Secondary Tr1_resting day 4-6 in IL-2  0.0
93568_primary Th1_anti-CD28/anti-CD3  1.3
93569_primary Th2_anti-CD28/anti-CD3  1.0
93570_primary Tr1_anti-CD28/anti-CD3  1.4
93565_primary Th1_resting dy 4-6 in IL-2  0.7
93566_primary Th2_resting dy 4-6 in IL-2  0.3
93567_primary Tr1_resting dy 4-6 in IL-2  1.0
93351_CD45RA CD4 lymphocyte_anti-CD28/anti-CD3  0.6
93352_CD45RO CD4 lymphocyte_anti-CD28/anti-CD3  0.5
93251_CD8 Lymphocytes_anti-CD28/anti-CD3  OJ
93353_chronic CD8 Lymphocytes 2ry_resting dy 4-6 in IL-2  as
93574_chronic CD8 Lymphocytes 2ry_activated CD3/CD28  0.3
93354_CD4_none  0.5
93252_Secondary Th1/Th2/Tr1_anti-CD95 CH11  0.1
93103_LAK cells_resting  0.9
93788_LAK cells_IL-2  0.3
93787_LAK cells_IL-2+IL-12  1.3
93789_LAK cells_IL-2+IFN gamma  1.3
93790_LAK cells_IL-2+ IL-18  1.0
93104_LAK cells_PMA/ionomycin and IL-18  9.3
93578_NK Cells IL-2_resting  0.2
93109_Mixed Lymphocyte Reaction_Two Way MLR  0.7
93110_Mixed Lymphocyte Reaction_Two Way MLR  0.5
93111_Mixed Lymphocyte Reaction_Two Way MLR  0.3
93112_Mononuclear Cells (PBMCs)_resting  1.9
93113_Mononuclear Cells (PBMCs)_PWM  3.0
93114_Mononuclear Cells (PBMCs)_PHA-L  OJ
93249_Ramos (B cell)_none  0.6
93250_Ramos (B cell)_ionomycin  1.7
93349_B lymphocytes_PWM  5.7
93350_B lymphocytes_CD40L and IL-4  2.7
92665_EOL-1 (Eosinophil)_dbcAMP differentiated  0 J
93248_EOL-1 (Eosinophil)_dbcAMP/PMA/ionomycin  1.5
93356_Dendritic Cells_none  0.2
93355_Dendritic Cells_LPS 100 ng/ml  0.1
93775_Dendritic Cells_anti-CD40  0.2
93774_Monocytes_resting  0.6
93776_Monocytes_LPS 50 ng/ml  0.4
93581_Macrophages_resting  0.6
93582_Macrophages_LPS 100 ng/ml  0.4
93098_HUVEC (Endothelial)_none  0.6
93099_HUVEC (Endothelial)_starved  0.6
93100_HUVEC (Endothelial)_IL-1b  0.4
93779_HUVEC (Endothelial)_IFN gamma  0.3
93102_HUVEC (Endothelial)_TNF alpha + IFN gamma  1.1
93101_HUVEC (Endothelial)_TNF alpha + IL4  M
93781_HUVEC (Endothelial)_IL-11  0.2
93583_Lung Microvascular Endothelial Cells_none  2.4
93584_Lung Microvascular Endothelial Cells_TNFa (4 ng/ml) and IL1b (1 ng/ml)  3.4
92662_Microvascular Dermal endothelium_none  2.1
92663_Microvascular Dermal endothelium_TNFa (4 ng/ml) and IL1b (1 ng/ml)  1.8
93773_Bronchial epithelium_TNFa (4 ng/ml) and IL1b (1 ng/ml) **  3.2
93347_Small Airway Epithelium_none  3.1
93348_Small Airway Epithelium_TNFa (4 ng/ml) and IL1b (1 ng/ml)  20.8
92668_Coronary Artery SMC_resting  0.8
92669_Coronary Artery SMC_TNFa (4 ng/ml) and IL1b (1 ng/ml)  0.4
93107_astrocytes_resting  1.4
93108_astrocytes_TNFa (4 ng/ml) and IL1b (1 ng/ml)  1.3
92666_KU-812 (Basophil)_resting  0.6
92667_KU-812 (Basophil)_PMA/ionomycin  1.2
93579_CCD1106 (Keratinocytes)_none  11.7
93580_CCD1106 (Keratinocytes)_TNFa and IFNg  1.4
93791_Liver Cirrhosis  3.0
93792_Lupus Kidney  OJ
93577_NCI-H292  96.6
93358_NCI-H292_IL-4  92.2
93360_NCI-H292_IL-9  100.0
93359_NCI-H292_IL-13  56J
93357_NCI-H292_IFN gamma  75.0
93777_HPAEC_-  0.4
93778_HPAEC_IL-1 beta/TNF alpha  1.4
93254_Normal Human Lung Fibroblast_none  0.2
93253_Normal Human Lung Fibroblast_TNFa (4 ng/ml) and IL-1b (1 ng/ml)  0.2
93257_Normal Human Lung Fibroblast_IL-4  0.5
93256_Normal Human Lung Fibroblast_IL-9  0.3
93255_Normal Human Lung Fibroblast_IL-13  0.2
93258_Normal Human Lung Fibroblast_IFN gamma  0.7
93106_Dermal Fibroblasts CCD1070_resting  0.3
93361_Dermal Fibroblasts CCD1070_TNF alpha 4 ng/ml  0.5
93105_Dermal Fibroblasts CCD1070_IL-1 beta 1 ng/ml  0.2
93772_dermal fibroblast_IFN gamma  0.2
93771_dermal fibroblast_IL-4  0.0
93259_IBD Colitis 1**  0.3
93260_IBD Colitis 2  0.5
93261_IBD Crohns  1.1
735010_Colon_normal  2.6
735019_Lung_none  6.2
64028-1_Thymus_none  1.8
64030-1_Kidney_none  8.4



The nucleic acid and protein of the invention are useful in potential therapeutic
applications implicated, for example but not limited to, in Adrenoleukodystrophy, Congenital
Adrenal Hyperplasia, Polycystic Kidney Disease, Stenosis, Interstitial Nephritis,
Glomerulonephritis, Atherosclerosis, Hypertension, Congenital Heart Defects, Aortic Stenosis,
Atrial Septal Defect, Alzheimer's Disease, Stroke, Tuberous Sclerosis, Hypercalcemia,
Parkinson's Disease, and other diseases and disorders. Potential therapeutic uses for the
invention(s) are, for example but not limited to, the following: (i) protein therapeutic, (ii)
small molecule drug target, (iii) antibody target (therapeutic, diagnostic, drug
targeting/cytotoxic antibody), (iv) diagnostic and/or prognostic marker, (v) gene therapy
(gene delivery/gene ablation), (vi) research tools, and (vii) tissue regeneration in vitro and in
vivo (regeneration for all these tissues and cell types composing these tissues and cell types
derived from these tissues, e.g., adrenal gland, kidney, brain, and heart).

The nucleic acids and proteins of the invention are useful in potential therapeutic
applications implicated in various diseases and disorders described below and/or other
pathologies and disorders. For example, but not limited to, a cDNA encoding the human salt-
inducible protein kinase-like protein may be useful in gene therapy, and the human salt-
inducible protein kinase-like protein may be useful when administered to a subject in need
thereof. By way of non-limiting example, the compositions of the present invention will have
efficacy for treatment of patients suffering from, for example, but not limited to,
Adrenoleukodystrophy, Congenital Adrenal Hyperplasia, Polycystic Kidney Disease, Stenosis,
Interstitial Nephritis, Glomerulonephritis, Atherosclerosis, Hypertension, Congenital Heart
Defects, Aortic Stenosis, Atrial Septal Defect, Alzheimer's Disease, Stroke, Tuberous
Sclerosis, Hypercalcemia, Parkinson's Disease, and other diseases and disorders. The novel
nucleic acid encoding the human salt-inducible protein kinase-like protein, and the human
salt-inducible protein kinase-like protein of the invention, or fragments thereof, may further be
useful in diagnostic applications, wherein the presence or amount of the nucleic acid or the
protein is to be assessed. These materials are further useful in the generation of antibodies
that bind immunospecifically to the novel substances of the invention for use in therapeutic or
diagnostic methods.
These materials are further useful in the generation of antibodies that bind immuno-
specifically to the novel NOV10 substances for use in therapeutic or diagnostic methods.
These antibodies may be generated according to methods known in the art, using prediction
from hydrophobicity charts, as described in the "Anti-NOVX Antibodies" section below. For
example, the disclosed NOV10 protein has multiple hydrophilic regions, each of which can be
used as an immunogen. In one embodiment, a contemplated NOV10 epitope is from about
amino acid 5 to about amino acid 40. In another embodiment, a NOV10 epitope is from
about amino acids 225 to 240. In additional embodiments, NOV10 epitopes are from about
amino acids 50 to 90; from about amino acids 105 to 175; from about amino acids 180 to 210;
from about amino acids 280 to 400; from about amino acids 450 to 490; and from about amino
acids 580 to 680. These novel proteins can also be used to develop assay systems for
functional analysis.

Example 1. Quantitative expression analysis of clones in various cells and tissues 

The quantitative expression of various clones was assessed using microtiter plates
containing RNA samples from a variety of normal and pathology-derived cells, cell lines and
tissues using real time quantitative PCR (RTQ PCR; TAQMAN®). RTQ PCR was performed
on a Perkin-Elmer Biosystems ABI PRISM® 7700 Sequence Detection System. Various
collections of samples are assembled on the plates, and referred to as Panel 1 (containing cells
and cell lines from normal and cancer sources), Panel 2 (containing samples derived from
tissues, in particular from surgical samples, from normal and cancer sources), Panel 3
(containing samples derived from a wide variety of cancer sources) and Panel 4 (containing
cells and cell lines from normal cells and cells related to inflammatory conditions).

First, the RNA samples were normalized to constitutively expressed genes such as β-
actin and GAPDH. RNA (~50 ng total or ~1 ng polyA+) was converted to cDNA using the
TAQMAN® Reverse Transcription Reagents Kit (PE Biosystems, Foster City, CA; Catalog
No. N808-0234) and random hexamers according to the manufacturer's protocol. Reactions
were performed in 20 µl and incubated for 30 min. at 48°C. cDNA (5 µl) was then transferred
to a separate plate for the TAQMAN® reaction using β-actin and GAPDH TAQMAN®
Assay Reagents (PE Biosystems; Catalog Nos. 4310881E and 4310884E, respectively) and
TAQMAN® universal PCR Master Mix (PE Biosystems; Catalog No. 4304447) according to
the manufacturer's protocol. Reactions were performed in 25 µl using the following
parameters: 2 min. at 50°C; 10 min. at 95°C; 15 sec. at 95°C/1 min. at 60°C (40 cycles).
Results were recorded as CT values (cycle at which a given sample crosses a threshold level of
fluorescence) using a log scale, with the difference in RNA concentration between a given
sample and the sample with the lowest CT value being represented as 2 to the power of delta
CT. The percent relative expression is then obtained by taking the reciprocal of this RNA
difference and multiplying by 100. The average CT values obtained for β-actin and GAPDH
were used to normalize RNA samples. The RNA sample generating the highest CT value
required no further diluting, while all other samples were diluted relative to this sample
according to their β-actin/GAPDH average CT values.
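
By way of non-limiting illustration, the calculation described above may be sketched as follows. The function names and the example CT values are hypothetical and do not appear in the source. The dilution function additionally assumes that one CT cycle corresponds to a two-fold difference in template, which is one reading of "diluted relative to this sample" and is not stated explicitly in the text.

```python
def percent_relative_expression(ct_by_sample):
    """Convert CT values into percent relative expression.

    The sample with the lowest CT is the reference and scores 100%.
    Every other sample is scored as 100 / 2**(delta CT).
    """
    if not ct_by_sample:
        raise ValueError("at least one CT value is required")
    lowest_ct = min(ct_by_sample.values())
    percent = {}
    for sample, ct in ct_by_sample.items():
        delta_ct = ct - lowest_ct            # cycles behind the reference
        rna_difference = 2.0 ** delta_ct     # "2 to the power of delta CT"
        percent[sample] = 100.0 / rna_difference
    return percent


def dilution_factors(housekeeping_ct_by_sample):
    """Suggest a dilution factor per sample from average housekeeping CT.

    Assumption (not from the source): the sample with the highest CT is
    left undiluted (factor 1.0) and each cycle of difference is treated
    as a two-fold difference in input RNA.
    """
    if not housekeeping_ct_by_sample:
        raise ValueError("at least one CT value is required")
    highest_ct = max(housekeeping_ct_by_sample.values())
    return {
        sample: 2.0 ** (highest_ct - ct)
        for sample, ct in housekeeping_ct_by_sample.items()
    }


if __name__ == "__main__":
    # Hypothetical CT values, for illustration only.
    example_ct = {"sample A": 24.0, "sample B": 25.0, "sample C": 27.0}
    for name, value in percent_relative_expression(example_ct).items():
        print(f"{name}: {value:.1f}")
```

With this convention the most abundant sample always reads 100.0, consistent with the tables above in which one sample per panel carries that value.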

Normalized RNA (5 µl) was converted to cDNA and analyzed via TAQMAN® using
One Step RT-PCR Master Mix Reagents (PE Biosystems; Catalog No. 4309169) and gene-
specific primers according to the manufacturer's instructions. Probes and primers were
designed for each assay according to Perkin Elmer Biosystems' Primer Express Software
package (version I for Apple Computer's Macintosh Power PC) or a similar algorithm using
the target sequence as input. Default settings were used for reaction conditions and the
following parameters were set before selecting primers: primer concentration = 250 nM,
primer melting temperature (Tm) range = 58°-60° C, primer optimal Tm = 59° C, maximum
primer difference = 2° C, probe does not have 5' G, probe Tm must be 10° C greater than
primer Tm, amplicon size 75 bp to 100 bp. The probes and primers selected (see below) were
synthesized by Synthegen (Houston, TX, USA). Probes were double purified by HPLC to
remove uncoupled dye and evaluated by mass spectroscopy to verify coupling of reporter and
quencher dyes to the 5' and 3' ends of the probe, respectively. Their final concentrations
were: forward and reverse primers, 900 nM each, and probe, 200 nM.
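
The selection parameters listed above can be expressed as a simple check, sketched below for illustration only. The class and function names are hypothetical, the Tm values are assumed to be supplied by whatever design software is used, and "10° C greater than primer Tm" is interpreted here as at least 10° C above the higher of the two primer Tm values, which is an assumption.

```python
from dataclasses import dataclass


@dataclass
class AssayDesign:
    """Hypothetical container for one primer/probe set."""
    forward_tm: float      # degrees C
    reverse_tm: float      # degrees C
    probe_tm: float        # degrees C
    probe_sequence: str    # written 5' to 3'
    amplicon_length: int   # base pairs


def design_problems(design):
    """Return a list of parameter violations (empty if none)."""
    problems = []
    for label, tm in (("forward", design.forward_tm),
                      ("reverse", design.reverse_tm)):
        if not 58.0 <= tm <= 60.0:
            problems.append(f"{label} primer Tm outside 58-60 C")
    if abs(design.forward_tm - design.reverse_tm) > 2.0:
        problems.append("primer Tm difference exceeds 2 C")
    if design.probe_sequence[:1].upper() == "G":
        problems.append("probe has a 5' G")
    # Assumed reading: compare against the higher primer Tm.
    if design.probe_tm < max(design.forward_tm, design.reverse_tm) + 10.0:
        problems.append("probe Tm less than 10 C above primer Tm")
    if not 75 <= design.amplicon_length <= 100:
        problems.append("amplicon size outside 75-100 bp")
    return problems


if __name__ == "__main__":
    # Made-up values, not a probe or primer set from this document.
    candidate = AssayDesign(59.0, 58.5, 69.5, "CATGCATGCATGCATGCATG", 85)
    print(design_problems(candidate) or "all parameters satisfied")
```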

PCR conditions: Normalized RNA from each tissue and each cell line was spotted in
each well of a 96 well PCR plate (Perkin Elmer Biosystems). PCR cocktails including two
probes (a probe specific for the target clone and another gene-specific probe multiplexed with
the target probe) were set up using 1X TaqMan™ PCR Master Mix for the PE Biosystems
7700, with 5 mM MgCl2, dNTPs (dA, G, C, U at 1:1:1:2 ratios), 0.25 U/ml AmpliTaq Gold™
(PE Biosystems), 0.4 U/µl RNase inhibitor, and 0.25 U/µl reverse transcriptase. Reverse
transcription was performed at 48° C for 30 minutes followed by amplification/PCR cycles as
follows: 95° C 10 min, then 40 cycles of 95° C for 15 seconds, 60° C for 1 minute.
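
For illustration only, the one-step thermal profile described above may be written down as data and its total programmed hold time computed. The step labels are descriptive names chosen here and are not terms from the source, and the calculation ignores temperature ramp time, which depends on the instrument.

```python
# Single holds that run once, as (label, temperature in C, seconds).
INITIAL_HOLDS = [
    ("reverse transcription", 48.0, 30 * 60),
    ("initial 95 C hold", 95.0, 10 * 60),
]

# Steps repeated on every amplification cycle.
CYCLE_STEPS = [
    ("denaturation", 95.0, 15),
    ("anneal/extend", 60.0, 60),
]

CYCLE_COUNT = 40


def total_hold_minutes():
    """Sum the programmed hold times, excluding ramp time."""
    initial = sum(seconds for _, _, seconds in INITIAL_HOLDS)
    per_cycle = sum(seconds for _, _, seconds in CYCLE_STEPS)
    return (initial + CYCLE_COUNT * per_cycle) / 60.0


if __name__ == "__main__":
    print(f"programmed hold time: {total_hold_minutes():.0f} min")
```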

The following abbreviations are used in the panels: ca. = carcinoma, * = established
from metastasis, met = metastasis, s cell var = small cell variant, non-s = non-sm = non-small,
squam = squamous, pl. eff = pl effusion = pleural effusion, glio = glioma, astro = astrocytoma,
and neuro = neuroblastoma.
Panel 2

The plates for Panel 2 generally include 2 control wells and 94 test samples composed
of RNA or cDNA isolated from human tissue procured by surgeons working in close
cooperation with the National Cancer Institute's Cooperative Human Tissue Network (CHTN)
or the National Disease Research Interchange (NDRI). The tissues are derived from human
malignancies and in cases where indicated many malignant tissues have "matched margins"
obtained from noncancerous tissue just adjacent to the tumor. These are termed normal
adjacent tissues and are denoted "NAT" in the results below. The tumor tissue and the
"matched margins" are evaluated by two independent pathologists (the surgical pathologist
and again by a pathologist at NDRI or CHTN). This analysis provides a gross
histopathological assessment of tumor differentiation grade. Moreover, most samples include
the original surgical pathology report that provides information regarding the clinical stage of
the patient. These matched margins are taken from the tissue surrounding (i.e. immediately
proximal to) the zone of surgery (designated "NAT", for normal adjacent tissue, in Table RR).
In addition, RNA and cDNA samples were obtained from various human tissues derived from
autopsies performed on elderly people or sudden death victims (accidents, etc.). These tissues
were ascertained to be free of disease and were purchased from various commercial sources
such as Clontech (Palo Alto, CA), Research Genetics, and Invitrogen.

RNA integrity from all samples is controlled for quality by visual assessment of
agarose gel electropherograms using 28S and 18S ribosomal RNA staining intensity ratio as a
guide (2:1 to 2.5:1 28S:18S) and the absence of low molecular weight RNAs that would be
indicative of degradation products. Samples are controlled against genomic DNA
contamination by RTQ PCR reactions run in the absence of reverse transcriptase using probe
and primer sets designed to amplify across the span of a single exon.
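
The integrity criterion above is applied visually in the text. Purely as an illustrative numeric stand-in, the same guide ratio could be checked as sketched below, assuming band intensities have been quantified by some densitometry step that the source does not describe. The function name and arguments are hypothetical.

```python
def passes_integrity_guide(intensity_28s, intensity_18s, low_mw_rna_present):
    """Apply the 2:1 to 2.5:1 28S:18S guide ratio to quantified bands.

    intensity_28s and intensity_18s are band intensities in any common
    unit; low_mw_rna_present flags a low molecular weight smear that
    would indicate degradation products.
    """
    if intensity_18s <= 0:
        raise ValueError("18S intensity must be positive")
    ratio = intensity_28s / intensity_18s
    return 2.0 <= ratio <= 2.5 and not low_mw_rna_present


if __name__ == "__main__":
    # Made-up intensities for illustration.
    print(passes_integrity_guide(2.2, 1.0, False))
```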
Panel 4

Panel 4 includes samples on a 96 well plate (2 control wells, 94 test samples)
composed of RNA (Panel 4r) or cDNA (Panel 4d) isolated from various human cell lines or
tissues related to inflammatory conditions. Total RNA from control normal tissues such as
colon and lung (Stratagene, La Jolla, CA) and thymus and kidney (Clontech) was employed.
Total RNA from liver tissue from cirrhosis patients and kidney from lupus patients was
obtained from BioChain (Biochain Institute, Inc., Hayward, CA). Intestinal tissue for RNA
preparation from patients diagnosed as having Crohn's disease and ulcerative colitis was
obtained from the National Disease Research Interchange (NDRI) (Philadelphia, PA).
Astrocytes, lung fibroblasts, dermal fibroblasts, coronary artery smooth muscle cells,
small airway epithelium, bronchial epithelium, microvascular dermal endothelial cells,
microvascular lung endothelial cells, human pulmonary aortic endothelial cells, and human
umbilical vein endothelial cells were all purchased from Clonetics (Walkersville, MD) and
grown in the media supplied for these cell types by Clonetics. These primary cell types were
activated with various cytokines or combinations of cytokines for 6 and/or 12-14 hours, as
indicated. The following cytokines were used: IL-1 beta at approximately 1-5 ng/ml, TNF
alpha at approximately 5-10 ng/ml, IFN gamma at approximately 20-50 ng/ml, IL-4 at
approximately 5-10 ng/ml, IL-9 at approximately 5-10 ng/ml, IL-13 at approximately 5-10
ng/ml. Endothelial cells were sometimes starved for various times by culture in the basal
media from Clonetics with 0.1% serum.

Mononuclear cells were prepared from blood of employees at CuraGen Corporation,
using Ficoll. LAK cells were prepared from these cells by culture in DMEM 5% FCS
(Hyclone), 100 µM non essential amino acids (Gibco/Life Technologies, Rockville, MD), 1
mM sodium pyruvate (Gibco), mercaptoethanol 5.5 x 10⁻⁵ M (Gibco), and 10 mM Hepes
(Gibco) and Interleukin 2 for 4-6 days. Cells were then either activated with 10-20 ng/ml
PMA and 1-2 µg/ml ionomycin, IL-12 at 5-10 ng/ml, IFN gamma at 20-50 ng/ml and IL-18 at
5-10 ng/ml for 6 hours. In some cases, mononuclear cells were cultured for 4-5 days in
DMEM 5% FCS (Hyclone), 100 µM non essential amino acids (Gibco), 1 mM sodium
pyruvate (Gibco), mercaptoethanol 5.5 x 10⁻⁵ M (Gibco), and 10 mM Hepes (Gibco) with
PHA (phytohemagglutinin) or PWM (pokeweed mitogen) at approximately 5 µg/ml. Samples
were taken at 24, 48 and 72 hours for RNA preparation. MLR (mixed lymphocyte reaction)
samples were obtained by taking blood from two donors, isolating the mononuclear cells using
Ficoll and mixing the isolated mononuclear cells 1:1 at a final concentration of approximately
2x10^ cells/ml in DMEM 5% FCS (Hyclone), 100 µM non essential amino acids (Gibco), 1
mM sodium pyruvate (Gibco), mercaptoethanol (5.5 x 10⁻⁵ M) (Gibco), and 10 mM Hepes
(Gibco). The MLR was cultured and samples taken at various time points ranging from 1-7
days for RNA preparation.

Monocytes were isolated from mononuclear cells using CD14 Miltenyi Beads, +ve VS
selection columns and a Vario Magnet according to the manufacturer's instructions.
Monocytes were differentiated into dendritic cells by culture in DMEM 5% fetal calf serum
(FCS) (Hyclone, Logan, UT), 100 µM non essential amino acids (Gibco), 1 mM sodium
pyruvate (Gibco), mercaptoethanol 5.5 x 10⁻⁵ M (Gibco), and 10 mM Hepes (Gibco), 50 ng/ml
GMCSF and 5 ng/ml IL-4 for 5-7 days. Macrophages were prepared by culture of monocytes
for 5-7 days in DMEM 5% FCS (Hyclone), 100 µM non essential amino acids (Gibco), 1 mM
sodium pyruvate (Gibco), mercaptoethanol 5.5 x 10⁻⁵ M (Gibco), 10 mM Hepes (Gibco) and
10% AB Human Serum or MCSF at approximately 50 ng/ml. Monocytes, macrophages and
dendritic cells were stimulated for 6 and 12-14 hours with lipopolysaccharide (LPS) at 100
ng/ml. Dendritic cells were also stimulated with anti-CD40 monoclonal antibody
(Pharmingen) at 10 µg/ml for 6 and 12-14 hours.

CD4 lymphocytes, CD8 lymphocytes and NK cells were also isolated from
mononuclear cells using CD4, CD8 and CD56 Miltenyi beads, positive VS selection columns
and a Vario Magnet according to the manufacturer's instructions. CD45RA and CD45RO CD4
lymphocytes were isolated by depleting mononuclear cells of CD8, CD56, CD14 and CD19
cells using CD8, CD56, CD14 and CD19 Miltenyi beads and +ve selection. Then CD45RO
beads were used to isolate the CD45RO CD4 lymphocytes with the remaining cells being
CD45RA CD4 lymphocytes. CD45RA CD4, CD45RO CD4 and CD8 lymphocytes were
placed in DMEM 5% FCS (Hyclone), 100 µM non essential amino acids (Gibco), 1 mM
sodium pyruvate (Gibco), mercaptoethanol 5.5 x 10⁻⁵ M (Gibco), and 10 mM Hepes (Gibco)
and plated at 10^ cells/ml onto Falcon 6 well tissue culture plates that had been coated
overnight with 0.5 µg/ml anti-CD28 (Pharmingen) and 3 ng/ml anti-CD3 (OKT3, ATCC) in
PBS. After 6 and 24 hours, the cells were harvested for RNA preparation. To prepare
chronically activated CD8 lymphocytes, we activated the isolated CD8 lymphocytes for 4 days
on anti-CD28 and anti-CD3 coated plates and then harvested the cells and expanded them in
DMEM 5% FCS (Hyclone), 100 µM non essential amino acids (Gibco), 1 mM sodium
pyruvate (Gibco), mercaptoethanol 5.5 x 10⁻⁵ M (Gibco), and 10 mM Hepes (Gibco) and IL-2.
The expanded CD8 cells were then activated again with plate bound anti-CD3 and anti-CD28
for 4 days and expanded as before. RNA was isolated 6 and 24 hours after the second
activation and after 4 days of the second expansion culture. The isolated NK cells were
cultured in DMEM 5% FCS (Hyclone), 100 µM non essential amino acids (Gibco), 1 mM
sodium pyruvate (Gibco), mercaptoethanol 5.5 x 10⁻⁵ M (Gibco), and 10 mM Hepes (Gibco)
and IL-2 for 4-6 days before RNA was prepared.

To obtain B cells, tonsils were procured from NDRI. The tonsil was cut up with sterile
dissecting scissors and then passed through a sieve. Tonsil cells were then spun down and
resuspended at 10^ cells/ml in DMEM 5% FCS (Hyclone), 100 µM non essential amino acids
(Gibco), 1 mM sodium pyruvate (Gibco), mercaptoethanol 5.5 x 10⁻⁵ M (Gibco), and 10 mM
Hepes (Gibco). To activate the cells, we used PWM at 5 µg/ml or anti-CD40 (Pharmingen) at
approximately 10 µg/ml and IL-4 at 5-10 ng/ml. Cells were harvested for RNA preparation at
24, 48 and 72 hours.

To prepare the primary and secondary Th1/Th2 and Tr1 cells, six-well Falcon plates
were coated overnight with 10 µg/ml anti-CD28 (Pharmingen) and 2 µg/ml OKT3 (ATCC),
and then washed twice with PBS. Umbilical cord blood CD4 lymphocytes (Poietic Systems,
Germantown, MD) were cultured at 10⁵-10⁶ cells/ml in DMEM 5% FCS (Hyclone), 100 µM
non essential amino acids (Gibco), 1 mM sodium pyruvate (Gibco), mercaptoethanol 5.5 x
10⁻⁵ M (Gibco), 10 mM Hepes (Gibco) and IL-2 (4 ng/ml). IL-12 (5 ng/ml) and anti-IL4 (1
µg/ml) were used to direct to Th1, while IL-4 (5 ng/ml) and anti-IFN gamma (1 ng/ml) were
used to direct to Th2 and IL-10 at 5 ng/ml was used to direct to Tr1. After 4-5 days, the
activated Th1, Th2 and Tr1 lymphocytes were washed once in DMEM and expanded for 4-7
days in DMEM 5% FCS (Hyclone), 100 µM non essential amino acids (Gibco), 1 mM sodium
pyruvate (Gibco), mercaptoethanol 5.5 x 10⁻⁵ M (Gibco), 10 mM Hepes (Gibco) and IL-2 (1
ng/ml). Following this, the activated Th1, Th2 and Tr1 lymphocytes were re-stimulated for 5
days with anti-CD28/OKT3 and cytokines as described above, but with the addition of anti-
CD95L (1 µg/ml) to prevent apoptosis. After 4-5 days, the Th1, Th2 and Tr1 lymphocytes
were washed and then expanded again with IL-2 for 4-7 days. Activated Th1 and Th2
lymphocytes were maintained in this way for a maximum of three cycles. RNA was prepared
from primary and secondary Th1, Th2 and Tr1 after 6 and 24 hours following the second and
third activations with plate bound anti-CD3 and anti-CD28 mAbs and 4 days into the second
and third expansion cultures in Interleukin 2.

The following leukocyte cell lines were obtained from the ATCC: Ramos, EOL-1,
KU-812. EOL cells were further differentiated by culture in 0.1 mM dbcAMP at 5 x 10^5
cells/ml for 8 days, changing the media every 3 days and adjusting the cell concentration to 5
x 10^5 cells/ml. For the culture of these cells, we used DMEM or RPMI (as recommended by
the ATCC), with the addition of 5% FCS (Hyclone), 100 µM non essential amino acids
(Gibco), 1 mM sodium pyruvate (Gibco), mercaptoethanol 5.5 x 10^-5 M (Gibco), 10 mM
Hepes (Gibco). RNA was either prepared from resting cells or cells activated with PMA at 10
ng/ml and ionomycin at 1 µg/ml for 6 and 14 hours. Keratinocyte line CCD106 and an airway
epithelial tumor line NCI-H292 were also obtained from the ATCC. Both were cultured in
DMEM 5% FCS (Hyclone), 100 µM non essential amino acids (Gibco), 1 mM sodium
pyruvate (Gibco), mercaptoethanol 5.5 x 10^-5 M (Gibco), and 10 mM Hepes (Gibco).
CCD1106 cells were activated for 6 and 14 hours with approximately 5 ng/ml TNF alpha and
1 ng/ml IL-1 beta, while NCI-H292 cells were activated for 6 and 14 hours with the following
cytokines: 5 ng/ml IL-4, 5 ug/ml 5 ng/ml IL-13 and 25 ng/ml IFN gamma.

For these cell lines and blood cells, RNA was prepared by lysing approximately 10^7
cells/ml using Trizol (Gibco BRL). Briefly, 1/10 volume of bromochloropropane (Molecular
Research Corporation) was added to the RNA sample, vortexed and after 10 minutes at room
temperature, the tubes were spun at 14,000 rpm in a Sorvall SS34 rotor. The aqueous phase
was removed and placed in a 15 ml Falcon Tube. An equal volume of isopropanol was added
and left at -20 degrees C overnight. The precipitated RNA was spun down at 9,000 rpm for
15 min in a Sorvall SS34 rotor and washed in 70% ethanol. The pellet was redissolved in 300
µl of RNAse-free water and 35 µl buffer (Promega), 5 µl DTT, 7 µl RNAsin and 8 µl DNAse
were added. The tube was incubated at 37 degrees C for 30 minutes to remove contaminating
genomic DNA, extracted once with phenol chloroform and re-precipitated with 1/10 volume
of 3 M sodium acetate and 2 volumes of 100% ethanol. The RNA was spun down and placed
in RNAse free water. RNA was stored at -80 degrees C.

NOVX Nucleic Acids and Polypeptides

One aspect of the invention pertains to isolated nucleic acid molecules that encode
NOVX polypeptides or biologically active portions thereof. Also included in the invention are
nucleic acid fragments sufficient for use as hybridization probes to identify NOVX-encoding
nucleic acids (e.g., NOVX mRNAs) and fragments for use as PCR primers for the
amplification and/or mutation of NOVX nucleic acid molecules. As used herein, the term
"nucleic acid molecule" is intended to include DNA molecules (e.g., cDNA or genomic
DNA), RNA molecules (e.g., mRNA), analogs of the DNA or RNA generated using
nucleotide analogs, and derivatives, fragments and homologs thereof. The nucleic acid
molecule may be single-stranded or double-stranded, but preferably is comprised of
double-stranded DNA.

An NOVX nucleic acid can encode a mature NOVX polypeptide. As used herein, a
"mature" form of a polypeptide or protein disclosed in the present invention is the product of a
naturally occurring polypeptide or precursor form or proprotein. The naturally occurring
polypeptide, precursor or proprotein includes, by way of nonlimiting example, the full-length
gene product, encoded by the corresponding gene. Alternatively, it may be defined as the
polypeptide, precursor or proprotein encoded by an ORF described herein. The product
"mature" form arises, again by way of nonlimiting example, as a result of one or more
naturally occurring processing steps as they may take place within the cell, or host cell, in
which the gene product arises. Examples of such processing steps leading to a "mature" form
of a polypeptide or protein include the cleavage of the N-terminal methionine residue encoded
by the initiation codon of an ORF, or the proteolytic cleavage of a signal peptide or leader
sequence. Thus a mature form arising from a precursor polypeptide or protein that has
residues 1 to N, where residue 1 is the N-terminal methionine, would have residues 2 through
N remaining after removal of the N-terminal methionine. Alternatively, a mature form arising
from a precursor polypeptide or protein having residues 1 to N, in which an N-terminal signal
sequence from residue 1 to residue M is cleaved, would have the residues from residue M+1 to
residue N remaining. Further as used herein, a "mature" form of a polypeptide or protein may
arise from a step of post-translational modification other than a proteolytic cleavage event.
Such additional processes include, by way of non-limiting example, glycosylation,
myristoylation or phosphorylation. In general, a mature polypeptide or protein may result
from the operation of only one of these processes, or a combination of any of them.
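The residue arithmetic described above (residues 2 through N after removal of the initiator methionine, or residues M+1 through N after cleavage of a signal sequence of length M) can be illustrated with a minimal Python sketch. The function name `mature_form` and the parameter `signal_length` are hypothetical and used only for illustration; the sketch does not predict cleavage sites and does not model other post-translational modifications.

```python
def mature_form(precursor: str, signal_length: int = 0) -> str:
    """Return the residues remaining after N-terminal processing.

    precursor      one-letter amino acid string, residues 1..N
    signal_length  M, the length of an N-terminal signal sequence (assumed
                   known); 0 means only the initiator methionine is removed
    """
    if signal_length < 0 or signal_length > len(precursor):
        raise ValueError("signal_length must be between 0 and len(precursor)")
    if signal_length > 0:
        # Residues 1..M are cleaved, leaving residues M+1..N.
        return precursor[signal_length:]
    if precursor.startswith("M"):
        # Residue 1 (N-terminal methionine) is removed, leaving residues 2..N.
        return precursor[1:]
    return precursor
```

For a hypothetical eight-residue precursor "MKTAYIAK", the call with no signal length leaves residues 2 through 8, and a signal length of 3 leaves residues 4 through 8.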

The term "probes", as utilized herein, refers to nucleic acid sequences of variable
length, preferably between at least about 10 nucleotides (nt), 100 nt, or as many as
approximately, e.g., 6,000 nt, depending upon the specific use. Probes are used in the
detection of identical, similar, or complementary nucleic acid sequences. Longer length
probes are generally obtained from a natural or recombinant source, are highly specific, and
much slower to hybridize than shorter-length oligomer probes. Probes may be single- or
double-stranded and designed to have specificity in PCR, membrane-based hybridization
technologies, or ELISA-like technologies.

The term "isolated" nucleic acid molecule, as utilized herein, is one which is separated
from other nucleic acid molecules which are present in the natural source of the nucleic acid.
Preferably, an "isolated" nucleic acid is free of sequences which naturally flank the nucleic
acid (i.e., sequences located at the 5'- and 3'-termini of the nucleic acid) in the genomic DNA
of the organism from which the nucleic acid is derived. For example, in various embodiments,
the isolated NOVX nucleic acid molecules can contain less than about 5 kb, 4 kb, 3 kb, 2 kb, 1
kb, 0.5 kb or 0.1 kb of nucleotide sequences which naturally flank the nucleic acid molecule in
genomic DNA of the cell/tissue from which the nucleic acid is derived (e.g., brain, heart, liver,
spleen, etc.). Moreover, an "isolated" nucleic acid molecule, such as a cDNA molecule, can
be substantially free of other cellular material or culture medium when produced by
recombinant techniques, or of chemical precursors or other chemicals when chemically
synthesized.

A nucleic acid molecule of the invention, e.g., a nucleic acid molecule having the
nucleotide sequence SEQ ID NOS:1, 3, 5, 7, 9, 11, 13, 15, 17, 19, 21, 23, 25, or a complement
of this aforementioned nucleotide sequence, can be isolated using standard molecular biology
techniques and the sequence information provided herein. Using all or a portion of the nucleic
acid sequence of SEQ ID NOS:1, 3, 5, 7, 9, 11, 13, 15, 17, 19, 21, 23, 25 as a hybridization
probe, NOVX molecules can be isolated using standard hybridization and cloning techniques
(e.g., as described in Sambrook, et al., (eds.), MOLECULAR CLONING: A LABORATORY
MANUAL 2nd Ed., Cold Spring Harbor Laboratory Press, Cold Spring Harbor, NY, 1989; and
Ausubel, et al., (eds.), CURRENT PROTOCOLS IN MOLECULAR BIOLOGY, John Wiley & Sons,
New York, NY, 1993.)

A nucleic acid of the invention can be amplified using cDNA, mRNA or alternatively,
genomic DNA, as a template and appropriate oligonucleotide primers according to standard
PCR amplification techniques. The nucleic acid so amplified can be cloned into an
appropriate vector and characterized by DNA sequence analysis. Furthermore,
oligonucleotides corresponding to NOVX nucleotide sequences can be prepared by standard
synthetic techniques, e.g., using an automated DNA synthesizer.

As used herein, the term "oligonucleotide" refers to a series of linked nucleotide
residues, which oligonucleotide has a sufficient number of nucleotide bases to be used in a
PCR reaction. A short oligonucleotide sequence may be based on, or designed from, a
genomic or cDNA sequence and is used to amplify, confirm, or reveal the presence of an
identical, similar or complementary DNA or RNA in a particular cell or tissue.
Oligonucleotides comprise portions of a nucleic acid sequence having about 10 nt, 50 nt, or
100 nt in length, preferably about 15 nt to 30 nt in length. In one embodiment of the
invention, an oligonucleotide comprising a nucleic acid molecule less than 100 nt in length
would further comprise at least 6 contiguous nucleotides of SEQ ID NOS:1, 3, 5, 7, 9, 11, 13, 15,
17, 19, 21, 23, and 25, or a complement thereof. Oligonucleotides may be chemically
synthesized and may also be used as probes.

In another embodiment, an isolated nucleic acid molecule of the invention comprises a
nucleic acid molecule that is a complement of the nucleotide sequence shown in SEQ ID
NOS:1, 3, 5, 7, 9, 11, 13, 15, 17, 19, 21, 23, and 25, or a portion of this nucleotide sequence
(e.g., a fragment that can be used as a probe or primer or a fragment encoding a biologically-
active portion of an NOVX polypeptide). A nucleic acid molecule that is complementary to
the nucleotide sequence shown in SEQ ID NOS:1, 3, 5, 7, 9, 11, 13, 15, 17, 19, 21, 23, or 25 is
one that is sufficiently complementary to the nucleotide sequence shown in SEQ ID NOS:1, 3, 5,
7, 9, 11, 13, 15, 17, 19, 21, 23, or 25 that it can hydrogen bond with little or no mismatches to
the nucleotide sequence shown in SEQ ID NOS:1, 3, 5, 7, 9, 11, 13, 15, 17, 19, 21, 23, and 25,
thereby forming a stable duplex.
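As a minimal sketch of what "complementary with little or no mismatches" means at the sequence level, the following Python computes the Watson-Crick reverse complement of a DNA strand and counts the positions at which a candidate strand fails to pair with it. The function names are hypothetical, only the four unambiguous DNA bases are handled, and Hoogsteen pairing and duplex stability are not modeled.

```python
# Watson-Crick pairing partners for the four DNA bases.
_PAIR = {"A": "T", "T": "A", "G": "C", "C": "G"}


def reverse_complement(seq: str) -> str:
    """Return the antiparallel complementary strand, written 5' to 3'."""
    return "".join(_PAIR[base] for base in reversed(seq.upper()))


def mismatches(strand: str, candidate: str) -> int:
    """Count positions where candidate fails to base-pair with strand.

    Both sequences are written 5' to 3' and must be the same length;
    a perfectly complementary candidate returns 0.
    """
    if len(strand) != len(candidate):
        raise ValueError("sequences must be the same length")
    expected = reverse_complement(strand)
    return sum(1 for a, b in zip(expected, candidate.upper()) if a != b)
```

How many mismatches still permit a stable duplex depends on length, base composition and hybridization conditions, so any cutoff applied to the count would be a separate assumption.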

As used herein, the term "complementary" refers to Watson-Crick or Hoogsteen base
pairing between nucleotide units of a nucleic acid molecule, and the term "binding" means
the physical or chemical interaction between two polypeptides or compounds or associated
polypeptides or compounds or combinations thereof. Binding includes ionic, non-ionic, van
der Waals, hydrophobic interactions, and the like. A physical interaction can be either direct
or indirect. Indirect interactions may be through or due to the effects of another polypeptide or
compound. Direct binding refers to interactions that do not take place through, or due to, the
effect of another polypeptide or compound, but instead are without other substantial chemical
intermediates.

Fragments provided herein are defined as sequences of at least 6 (contiguous) nucleic
acids or at least 4 (contiguous) amino acids, a length sufficient to allow for specific
hybridization in the case of nucleic acids or for specific recognition of an epitope in the case of
amino acids, respectively, and are at most some portion less than a full length sequence.
Fragments may be derived from any contiguous portion of a nucleic acid or amino acid
sequence of choice. Derivatives are nucleic acid sequences or amino acid sequences formed
from the native compounds either directly or by modification or partial substitution. Analogs
are nucleic acid sequences or amino acid sequences that have a structure similar to, but not
identical to, the native compound but differ from it in respect to certain components or side
chains. Analogs may be synthetic or from a different evolutionary origin and may have a
similar or opposite metabolic activity compared to wild type. Homologs are nucleic acid
sequences or amino acid sequences of a particular gene that are derived from different species.

Derivatives and analogs may be full length or other than full length, if the derivative or
analog contains a modified nucleic acid or amino acid, as described below. Derivatives or
analogs of the nucleic acids or proteins of the invention include, but are not limited to,
molecules comprising regions that are substantially homologous to the nucleic acids or
proteins of the invention, in various embodiments, by at least about 70%, 80%, or 95%
identity (with a preferred identity of 80-95%) over a nucleic acid or amino acid sequence of
identical size or when compared to an aligned sequence in which the alignment is done by a
computer homology program known in the art, or whose encoding nucleic acid is capable of
hybridizing to the complement of a sequence encoding the aforementioned proteins under
stringent, moderately stringent, or low stringent conditions. See e.g. Ausubel, et al., CURRENT
PROTOCOLS IN MOLECULAR BIOLOGY, John Wiley & Sons, New York, NY, 1993, and below.

A "homologous nucleic acid sequence" or "homologous amino acid sequence," or
variations thereof, refer to sequences characterized by a homology at the nucleotide level or
amino acid level as discussed above. Homologous nucleotide sequences encode those
sequences coding for isoforms of NOVX polypeptides. Isoforms can be expressed in different
tissues of the same organism as a result of, for example, alternative splicing of RNA.
Alternatively, isoforms can be encoded by different genes. In the invention, homologous
nucleotide sequences include nucleotide sequences encoding for an NOVX polypeptide of
species other than humans, including, but not limited to: vertebrates, and thus can include, e.g.,
frog, mouse, rat, rabbit, dog, cat, cow, horse, and other organisms. Homologous nucleotide
sequences also include, but are not limited to, naturally occurring allelic variations and
mutations of the nucleotide sequences set forth herein. A homologous nucleotide sequence
does not, however, include the exact nucleotide sequence encoding human NOVX protein.
Homologous nucleic acid sequences include those nucleic acid sequences that encode
conservative amino acid substitutions (see below) in SEQ ID NOS:1, 3, 5, 7, 9, 11, 13, 15, 17,
19, 21, 23, and 25, as well as a polypeptide possessing NOVX biological activity. Various
biological activities of the NOVX proteins are described below.

An NOVX polypeptide is encoded by the open reading frame ("ORF") of an NOVX
nucleic acid. An ORF corresponds to a nucleotide sequence that could potentially be translated
into a polypeptide. A stretch of nucleic acids comprising an ORF is uninterrupted by a stop
codon. An ORF that represents the coding sequence for a full protein begins with an ATG
"start" codon and terminates with one of the three "stop" codons, namely, TAA, TAG, or
TGA. For the purposes of this invention, an ORF may be any part of a coding sequence, with
or without a start codon, a stop codon, or both. For an ORF to be considered as a good
candidate for coding for a bona fide cellular protein, a minimum size requirement is often set,
e.g., a stretch of DNA that would encode a protein of 50 amino acids or more.
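The full-protein ORF definition above (an ATG start codon, an uninterrupted run of codons, one of the stop codons TAA, TAG or TGA, and a minimum size such as 50 amino acids) can be sketched as a simple scan. This is an illustrative Python sketch under stated assumptions: the function name `find_orfs` and the parameter `min_codons` are hypothetical, only the three forward reading frames of the given strand are scanned, and partial ORFs lacking a start or stop codon are ignored even though the text allows them.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def find_orfs(dna: str, min_codons: int = 50) -> list[tuple[int, int]]:
    """Find ATG...stop ORFs in the three forward reading frames.

    Returns (start, end) index pairs, 0-based, end exclusive and including
    the stop codon. min_codons counts amino-acid-encoding codons, that is,
    the start codon onward but not the stop codon.
    """
    dna = dna.upper()
    orfs = []
    for frame in range(3):
        start = None
        for i in range(frame, len(dna) - 2, 3):
            codon = dna[i:i + 3]
            if start is None and codon == "ATG":
                start = i  # first ATG opens a candidate ORF
            elif start is not None and codon in STOP_CODONS:
                if (i - start) // 3 >= min_codons:
                    orfs.append((start, i + 3))
                start = None  # resume the search after the stop codon
    return orfs
```

Scanning the reverse complement as well would cover the opposite strand; that step is omitted here to keep the sketch short.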

The nucleotide sequences determined from the cloning of the human NOVX genes
allow for the generation of probes and primers designed for use in identifying and/or cloning
NOVX homologues in other cell types, e.g. from other tissues, as well as NOVX homologues
from other vertebrates. The probe/primer typically comprises substantially purified
oligonucleotide. The oligonucleotide typically comprises a region of nucleotide sequence that
hybridizes under stringent conditions to at least about 12, 25, 50, 100, 150, 200, 250, 300, 350
or 400 consecutive sense strand nucleotide sequence of SEQ ID NOS:1, 3, 5, 7, 9, 11, 13, 15, 17,
19, 21, 23, or 25; or an anti-sense strand nucleotide sequence of SEQ ID NOS:1, 3, 5, 7, 9, 11,
13, 15, 17, 19, 21, 23, or 25; or of a naturally occurring mutant of SEQ ID NOS:1, 3, 5, 7, 9,
11, 13, 15, 17, 19, 21, 23, and 25.

Probes based on the human NOVX nucleotide sequences can be used to detect
transcripts or genomic sequences encoding the same or homologous proteins. In various
embodiments, the probe further comprises a label group attached thereto, e.g. the label group
can be a radioisotope, a fluorescent compound, an enzyme, or an enzyme co-factor. Such
probes can be used as a part of a diagnostic test kit for identifying cells or tissues which mis-
express an NOVX protein, such as by measuring a level of an NOVX-encoding nucleic acid in
a sample of cells from a subject e.g., detecting NOVX mRNA levels or determining whether a
genomic NOVX gene has been mutated or deleted.

"A polypeptide having a biologically-active portion of an NOVX polypeptide" refers
to polypeptides exhibiting activity similar, but not necessarily identical to, an activity of a
polypeptide of the invention, including mature forms, as measured in a particular biological
assay, with or without dose dependency. A nucleic acid fragment encoding a "biologically-
active portion of NOVX" can be prepared by isolating a portion of SEQ ID NOS:1, 3, 5, 7, 9, 11,
13, 15, 17, 19, 21, 23, or 25, that encodes a polypeptide having an NOVX biological activity
(the biological activities of the NOVX proteins are described below), expressing the encoded
portion of NOVX protein (e.g., by recombinant expression in vitro) and assessing the activity
of the encoded portion of NOVX.

NOVX Nucleic Acid and Polypeptide Variants 

The invention further encompasses nucleic acid molecules that differ from the
nucleotide sequences shown in SEQ ID NOS:1, 3, 5, 7, 9, 11, 13, 15, 17, 19, 21, 23, and 25
due to degeneracy of the genetic code and thus encode the same NOVX proteins as that
encoded by the nucleotide sequences shown in SEQ ID NOS:1, 3, 5, 7, 9, 11, 13, 15, 17, 19,
21, 23, and 25. In another embodiment, an isolated nucleic acid molecule of the invention has
a nucleotide sequence encoding a protein having an amino acid sequence shown in SEQ ID
NOS:2, 4, 6, 8, 10, 12, 14, 16, 18, 20, 22, 24, and 26.
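Degeneracy of the genetic code means that two coding sequences can differ at the nucleotide level and still encode the same protein. The following Python sketch illustrates this with the standard genetic code; the helper names `translate` and `encode_same_protein` are hypothetical, and the short sequences in the usage note are made-up examples, not NOVX sequences.

```python
from itertools import product

# Standard genetic code, codons enumerated with bases in the order T, C, A, G.
_BASES = "TCAG"
_AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa for codon, aa in zip(product(_BASES, repeat=3), _AMINO)
}


def translate(cds: str) -> str:
    """Translate a coding sequence from its first codon to the first stop."""
    cds = cds.upper()
    protein = []
    for i in range(0, len(cds) - 2, 3):
        aa = CODON_TABLE[cds[i:i + 3]]
        if aa == "*":  # stop codon ends translation
            break
        protein.append(aa)
    return "".join(protein)


def encode_same_protein(cds_a: str, cds_b: str) -> bool:
    """True if two coding sequences translate to the same amino acid sequence."""
    return translate(cds_a) == translate(cds_b)
```

For example, the made-up sequences ATGGCTAAATAA and ATGGCCAAGTGA differ at three positions (GCT versus GCC, AAA versus AAG, and the stop codon) yet both translate to Met-Ala-Lys.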

In addition to the human NOVX nucleotide sequences shown in SEQ ID NOS:1, 3, 5,
7, 9, 11, 13, 15, 17, 19, 21, 23, and 25, it will be appreciated by those skilled in the art that
DNA sequence polymorphisms that lead to changes in the amino acid sequences of the NOVX
polypeptides may exist within a population (e.g., the human population). Such genetic
polymorphism in the NOVX genes may exist among individuals within a population due to
natural allelic variation. As used herein, the terms "gene" and "recombinant gene" refer to
nucleic acid molecules comprising an open reading frame (ORF) encoding an NOVX protein,
preferably a vertebrate NOVX protein. Such natural allelic variations can typically result in
1-5% variance in the nucleotide sequence of the NOVX genes. Any and all such nucleotide
variations and resulting amino acid polymorphisms in the NOVX polypeptides, which are the
result of natural allelic variation and that do not alter the functional activity of the NOVX
polypeptides, are intended to be within the scope of the invention.

Moreover, nucleic acid molecules encoding NOVX proteins from other species, and
thus that have a nucleotide sequence that differs from the human SEQ ID NOS:1, 3, 5, 7, 9,
11, 13, 15, 17, 19, 21, 23, and 25 are intended to be within the scope of the invention. Nucleic
acid molecules corresponding to natural allelic variants and homologues of the NOVX cDNAs
of the invention can be isolated based on their homology to the human NOVX nucleic acids
disclosed herein using the human cDNAs, or a portion thereof, as a hybridization probe
according to standard hybridization techniques under stringent hybridization conditions.

Accordingly, in another embodiment, an isolated nucleic acid molecule of the
invention is at least 6 nucleotides in length and hybridizes under stringent conditions to the
nucleic acid molecule comprising the nucleotide sequence of SEQ ID NOS:1, 3, 5, 7, 9, 11,
13, 15, 17, 19, 21, 23, and 25. In another embodiment, the nucleic acid is at least 10, 25, 50,
100, 250, 500, 750, 1000, 1500, or 2000 or more nucleotides in length. In yet another
embodiment, an isolated nucleic acid molecule of the invention hybridizes to the coding
region. As used herein, the term "hybridizes under stringent conditions" is intended to
describe conditions for hybridization and washing under which nucleotide sequences at least
60% homologous to each other typically remain hybridized to each other.

Homologs (i.e., nucleic acids encoding NOVX proteins derived from species other
than human) or other related sequences (e.g., paralogs) can be obtained by low, moderate or
high stringency hybridization with all or a portion of the particular human sequence as a probe
using methods well known in the art for nucleic acid hybridization and cloning.

As used herein, the phrase "stringent hybridization conditions" refers to conditions
under which a probe, primer or oligonucleotide will hybridize to its target sequence, but to no
other sequences. Stringent conditions are sequence-dependent and will be different in
different circumstances. Longer sequences hybridize specifically at higher temperatures than
shorter sequences. Generally, stringent conditions are selected to be about 5°C lower than the
thermal melting point (Tm) for the specific sequence at a defined ionic strength and pH. The
Tm is the temperature (under defined ionic strength, pH and nucleic acid concentration) at
which 50% of the probes complementary to the target sequence hybridize to the target
sequence at equilibrium. Since the target sequences are generally present at excess, at Tm,
50% of the probes are occupied at equilibrium. Typically, stringent conditions will be those in
which the salt concentration is less than about 1.0 M sodium ion, typically about 0.01 to 1.0 M
sodium ion (or other salts) at pH 7.0 to 8.3 and the temperature is at least about 30°C for
short probes, primers or oligonucleotides (e.g., 10 nt to 50 nt) and at least about 60°C for
longer probes, primers and oligonucleotides. Stringent conditions may also be achieved with
the addition of destabilizing agents, such as formamide.

Stringent conditions are known to those skilled in the art and can be found in Ausubel,
et al., (eds.), CURRENT PROTOCOLS IN MOLECULAR BIOLOGY, John Wiley & Sons, N.Y.
(1989), 6.3.1-6.3.6. Preferably, the conditions are such that sequences at least about 65%,
70%, 75%, 85%, 90%, 95%, 98%, or 99% homologous to each other typically remain
hybridized to each other. A non-limiting example of stringent hybridization conditions is
hybridization in a high salt buffer comprising 6X SSC, 50 mM Tris-HCl (pH 7.5), 1 mM
EDTA, 0.02% PVP, 0.02% Ficoll, 0.02% BSA, and 500 mg/ml denatured salmon sperm DNA
at 65°C, followed by one or more washes in 0.2X SSC, 0.01% BSA at 50°C. An isolated
nucleic acid molecule of the invention that hybridizes under stringent conditions to the
sequences SEQ ID NOS:1, 3, 5, 7, 9, 11, 13, 15, 17, 19, 21, 23, and 25, corresponds to a
naturally-occurring nucleic acid molecule. As used herein, a "naturally-occurring" nucleic
acid molecule refers to an RNA or DNA molecule having a nucleotide sequence that occurs in
nature (e.g., encodes a natural protein).

In a second embodiment, a nucleic acid sequence that is hybridizable to the nucleic
acid molecule comprising the nucleotide sequence of SEQ ID NOS:1, 3, 5, 7, 9, 11, 13, 15, 17,
19, 21, 23, and 25, or fragments, analogs or derivatives thereof, under conditions of moderate
stringency is provided. A non-limiting example of moderate stringency hybridization
conditions is hybridization in 6X SSC, 5X Denhardt's solution, 0.5% SDS and 100 mg/ml
denatured salmon sperm DNA at 55°C, followed by one or more washes in 1X SSC, 0.1%
SDS at 37°C. Other conditions of moderate stringency that may be used are well-known
within the art. See, e.g., Ausubel, et al. (eds.), 1993, CURRENT PROTOCOLS IN MOLECULAR
BIOLOGY, John Wiley & Sons, NY, and Kriegler, 1990; GENE TRANSFER AND EXPRESSION, A
LABORATORY MANUAL, Stockton Press, NY.

In a third embodiment, a nucleic acid that is hybridizable to the nucleic acid molecule
comprising the nucleotide sequences SEQ ID NOS:1, 3, 5, 7, 9, 11, 13, 15, 17, 19, 21, 23, and
25, or fragments, analogs or derivatives thereof, under conditions of low stringency, is
provided. A non-limiting example of low stringency hybridization conditions is
hybridization in 35% formamide, 5X SSC, 50 mM Tris-HCl (pH 7.5), 5 mM EDTA, 0.02%
PVP, 0.02% Ficoll, 0.2% BSA, 100 mg/ml denatured salmon sperm DNA, 10% (wt/vol)
dextran sulfate at 40°C, followed by one or more washes in 2X SSC, 25 mM Tris-HCl (pH
7.4), 5 mM EDTA, and 0.1% SDS at 50°C. Other conditions of low stringency that may be
used are well known in the art (e.g., as employed for cross-species hybridizations). See, e.g.,
Ausubel, et al. (eds.), 1993, CURRENT PROTOCOLS IN MOLECULAR BIOLOGY, John Wiley &
Sons, NY, and Kriegler, 1990, GENE TRANSFER AND EXPRESSION, A LABORATORY MANUAL,
Stockton Press, NY; Shilo and Weinberg, 1981. Proc Natl Acad Sci USA 78: 6789-6792.

Conservative Mutations

In addition to naturally-occurring allelic variants of NOVX sequences that may exist in
the population, the skilled artisan will further appreciate that changes can be introduced by
mutation into the nucleotide sequences SEQ ID NOS:1, 3, 5, 7, 9, 11, 13, 15, 17, 19, 21, 23,
and 25, thereby leading to changes in the amino acid sequences of the encoded NOVX
proteins, without altering the functional ability of said NOVX proteins. For example,
nucleotide substitutions leading to amino acid substitutions at "non-essential" amino acid
residues can be made in the sequence SEQ ID NOS:2, 4, 6, 8, 10, 12, 14, 16, 18, 20, 22, 24,
and 26. A "non-essential" amino acid residue is a residue that can be altered from the
wild-type sequences of the NOVX proteins without altering their biological activity, whereas
an "essential" amino acid residue is required for such biological activity. For example, amino
acid residues that are conserved among the NOVX proteins of the invention are predicted to be
particularly non-amenable to alteration. Amino acids for which conservative substitutions can
be made are well-known within the art.

Another aspect of the invention pertains to nucleic acid molecules encoding NOVX
proteins that contain changes in amino acid residues that are not essential for activity. Such
NOVX proteins differ in amino acid sequence from SEQ ID NOS:2, 4, 6, 8, 10, 12, 14, 16, 18,
20, 22, 24, or 26 yet retain biological activity. In one embodiment, the isolated nucleic acid
molecule comprises a nucleotide sequence encoding a protein, wherein the protein comprises
an amino acid sequence at least about 45% homologous to the amino acid sequences SEQ ID
NOS:2, 4, 6, 8, 10, 12, 14, 16, 18, 20, 22, 24, or 26. Preferably, the protein encoded by the
nucleic acid molecule is at least about 60% homologous to SEQ ID NOS:2, 4, 6, 8, 10, 12, 14,
16, 18, 20, 22, 24, or 26; more preferably at least about 70% homologous to SEQ ID NOS:2, 4,
6, 8, 10, 12, 14, 16, 18, 20, 22, 24, and 26; still more preferably at least about 80%
homologous to SEQ ID NOS:2, 4, 6, 8, 10, 12, 14, 16, 18, 20, 22, 24, and 26; even more
preferably at least about 90% homologous to SEQ ID NOS:2, 4, 6, 8, 10, 12, 14, 16, 18, 20,
22, 24, and 26; and most preferably at least about 95% homologous to SEQ ID NOS:2, 4, 6, 8,
10, 12, 14, 16, 18, 20, 22, 24, and 26.
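Percentage thresholds such as those above are commonly evaluated as percent identity over an alignment. The following Python sketch shows one simple way to compute such a figure for two sequences that have already been aligned to equal length; it is an assumption made for illustration, not a definition taken from this specification. The function name `percent_identity` is hypothetical, gap characters are written as "-", and the denominator is the full alignment length (other conventions exist).

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent of alignment columns holding the same residue in both rows.

    Inputs must be pre-aligned to equal length, with '-' marking gaps.
    Columns containing a gap never count as identical.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    if not aligned_a:
        raise ValueError("alignment is empty")
    identical = sum(
        1 for x, y in zip(aligned_a.upper(), aligned_b.upper())
        if x == y and x != "-"
    )
    return 100.0 * identical / len(aligned_a)
```

A value returned by such a function could then be compared against a chosen threshold (for example 45, 60, 70, 80, 90 or 95).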

An isolated nucleic acid molecule encoding an NOVX protein homologous to the
protein of SEQ ID NOS:2, 4, 6, 8, 10, 12, 14, 16, 18, 20, 22, 24, and 26 can be created by
introducing one or more nucleotide substitutions, additions or deletions into the nucleotide
sequence of SEQ ID NOS:1, 3, 5, 7, 9, 11, 13, 15, 17, 19, 21, 23, and 25, such that one or
more amino acid substitutions, additions or deletions are introduced into the encoded protein.

Mutations can be introduced into SEQ ID NOS:2, 4, 6, 8, 10, 12, 14, 16, 18, 20, 22, 24,
and 26 by standard techniques, such as site-directed mutagenesis and PCR-mediated
mutagenesis. Preferably, conservative amino acid substitutions are made at one or more
predicted, non-essential amino acid residues. A "conservative amino acid substitution" is one
in which the amino acid residue is replaced with an amino acid residue having a similar side
chain. Families of amino acid residues having similar side chains have been defined within
the art. These families include amino acids with basic side chains (e.g., lysine, arginine,
histidine), acidic side chains (e.g., aspartic acid, glutamic acid), uncharged polar side chains
(e.g., glycine, asparagine, glutamine, serine, threonine, tyrosine, cysteine), nonpolar side
chains (e.g., alanine, valine, leucine, isoleucine, proline, phenylalanine, methionine,
tryptophan), beta-branched side chains (e.g., threonine, valine, isoleucine) and aromatic side
chains (e.g., tyrosine, phenylalanine, tryptophan, histidine). Thus, a predicted non-essential
amino acid residue in the NOVX protein is replaced with another amino acid residue from the
same side chain family. Alternatively, in another embodiment, mutations can be introduced
randomly along all or part of an NOVX coding sequence, such as by saturation mutagenesis,
and the resultant mutants can be screened for NOVX biological activity to identify mutants
that retain activity. Following mutagenesis of SEQ ID NOS:1, 3, 5, 7, 9, 11, 13, 15, 17, 19, 21,
23, and 25, the encoded protein can be expressed by any recombinant technology known in the
art and the activity of the protein can be determined.
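The side-chain families listed above lend themselves to a simple lookup. The following Python sketch encodes those families with one-letter amino acid codes and tests whether a substitution stays within a family; the names `SIDE_CHAIN_FAMILIES`, `shared_families` and `is_conservative` are hypothetical, and the check says nothing about whether a given residue is essential for activity.

```python
# Side-chain families as listed in the text, using one-letter codes.
SIDE_CHAIN_FAMILIES = {
    "basic": set("KRH"),              # lysine, arginine, histidine
    "acidic": set("DE"),              # aspartic acid, glutamic acid
    "uncharged polar": set("GNQSTYC"),
    "nonpolar": set("AVLIPFMW"),
    "beta-branched": set("TVI"),
    "aromatic": set("YFWH"),
}


def shared_families(original: str, replacement: str) -> list[str]:
    """Names of the families that contain both residues."""
    a, b = original.upper(), replacement.upper()
    return [
        name for name, members in SIDE_CHAIN_FAMILIES.items()
        if a in members and b in members
    ]


def is_conservative(original: str, replacement: str) -> bool:
    """True if the replacement differs from the original residue but
    belongs to at least one of the same side-chain families."""
    if original.upper() == replacement.upper():
        return False
    return bool(shared_families(original, replacement))
```

Because the families overlap (valine and isoleucine, for instance, are both nonpolar and beta-branched), a pair of residues can share more than one family.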

The relatedness of amino acid families may also be determined based on side chain
interactions. Substituted amino acids may be fully conserved "strong" residues or fully
conserved "weak" residues. The "strong" group of conserved amino acid residues may be any
one of the following groups: STA, NEQK, NHQK, NDEQ, QHRK, MILV, MILF, HY, FYW,
wherein the single letter amino acid codes are grouped by those amino acids that may be
substituted for each other. Likewise, the "weak" group of conserved residues may be any one
of the following: CSA, ATV, SAG, STNK, STPA, SGND, SNDEQK, NDEQHK, NEQHRK,
VLM, HFY, wherein the letters within each group represent the single letter amino acid code.
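The "strong" and "weak" groups can be applied mechanically to classify a substitution. The Python sketch below does this; the function name `classify_substitution` is hypothetical. The strong groups are taken as listed above. For the weak list, the entries SAG and STNK are assumed readings that follow the conservation groups used in Clustal-style alignment output, and that correspondence is an assumption rather than something stated in the text.

```python
STRONG_GROUPS = ["STA", "NEQK", "NHQK", "NDEQ", "QHRK", "MILV", "MILF", "HY", "FYW"]

# SAG and STNK are assumed readings (Clustal-style convention); see lead-in.
WEAK_GROUPS = ["CSA", "ATV", "SAG", "STNK", "STPA", "SGND", "SNDEQK",
               "NDEQHK", "NEQHRK", "VLM", "HFY"]


def _share_group(a: str, b: str, groups: list[str]) -> bool:
    """True if some group contains both one-letter residue codes."""
    return any(a in group and b in group for group in groups)


def classify_substitution(original: str, replacement: str) -> str:
    """Return 'identical', 'strong', 'weak' or 'none' for a residue pair."""
    a, b = original.upper(), replacement.upper()
    if a == b:
        return "identical"
    if _share_group(a, b, STRONG_GROUPS):
        return "strong"  # strong membership takes precedence over weak
    if _share_group(a, b, WEAK_GROUPS):
        return "weak"
    return "none"
```

Checking the strong groups first means that a pair found in both lists, such as serine and threonine, is reported as "strong".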

In one embodiment, a mutant NOVX protein can be assayed for (i) the ability to form
protein:protein interactions with other NOVX proteins, other cell-surface proteins, or
biologically-active portions thereof, (ii) complex formation between a mutant NOVX protein
and an NOVX ligand; or (iii) the ability of a mutant NOVX protein to bind to an intracellular
target protein or biologically-active portion thereof; (e.g., avidin proteins).

In yet another embodiment, a mutant NOVX protein can be assayed for the ability to
regulate a specific biological function (e.g., regulation of insulin release).

Antisense Nucleic Acids 

Another aspect of the invention pertains to isolated antisense nucleic acid molecules
that are hybridizable to or complementary to the nucleic acid molecule comprising the
nucleotide sequence of SEQ ID NOS:1, 3, 5, 7, 9, 11, 13, 15, 17, 19, 21, 23, and 25, or
fragments, analogs or derivatives thereof. An "antisense" nucleic acid comprises a nucleotide
sequence that is complementary to a "sense" nucleic acid encoding a protein (e.g.,
complementary to the coding strand of a double-stranded cDNA molecule or complementary
to an mRNA sequence). In specific aspects, antisense nucleic acid molecules are provided that
comprise a sequence complementary to at least about 10, 25, 50, 100, 250 or 500 nucleotides
or an entire NOVX coding strand, or to only a portion thereof. Nucleic acid molecules
encoding fragments, homologs, derivatives and analogs of an NOVX protein of SEQ ID
NOS:2, 4, 6, 8, 10, 12, 14, 16, 18, 20, 22, 24, and 26, or antisense nucleic acids
complementary to an NOVX nucleic acid sequence of SEQ ID NOS:1, 3, 5, 7, 9, 11, 13, 15,
17, 19, 21, 23, and 25, are additionally provided.

In one embodiment, an antisense nucleic acid molecule is antisense to a "coding
region" of the coding strand of a nucleotide sequence encoding an NOVX protein. The term
"coding region" refers to the region of the nucleotide sequence comprising codons which are
translated into amino acid residues. In another embodiment, the antisense nucleic acid
molecule is antisense to a "noncoding region" of the coding strand of a nucleotide sequence
encoding the NOVX protein. The term "noncoding region" refers to 5' and 3' sequences which
flank the coding region that are not translated into amino acids (i.e., also referred to as 5' and
3' untranslated regions).

Given the coding strand sequences encoding the NOVX protein disclosed herein,
antisense nucleic acids of the invention can be designed according to the rules of Watson and
Crick or Hoogsteen base pairing. The antisense nucleic acid molecule can be complementary
to the entire coding region of NOVX mRNA, but more preferably is an oligonucleotide that is
antisense to only a portion of the coding or noncoding region of NOVX mRNA. For example,
the antisense oligonucleotide can be complementary to the region surrounding the translation
start site of NOVX mRNA. An antisense oligonucleotide can be, for example, about 5, 10, 15,
20, 25, 30, 35, 40, 45 or 50 nucleotides in length. An antisense nucleic acid of the invention
can be constructed using chemical synthesis or enzymatic ligation reactions using procedures
known in the art. For example, an antisense nucleic acid (e.g., an antisense oligonucleotide)
can be chemically synthesized using naturally-occurring nucleotides or variously modified
nucleotides designed to increase the biological stability of the molecules or to increase the
physical stability of the duplex formed between the antisense and sense nucleic acids (e.g.,
phosphorothioate derivatives and acridine substituted nucleotides can be used).
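
As a non-limiting illustration of Watson-Crick design of an antisense oligonucleotide complementary to the region surrounding the translation start site, the following sketch may be considered. It assumes a DNA-alphabet coding strand, takes the first ATG as the start codon, and uses hypothetical names (`antisense_oligo`, `flank`) and a toy sequence that is not an NOVX sequence; Hoogsteen pairing and modified nucleotides are outside its scope.

```python
COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A"}


def reverse_complement(dna: str) -> str:
    """Watson-Crick reverse complement of a DNA sequence, read 5'->3'."""
    return "".join(COMPLEMENT[base] for base in reversed(dna.upper()))


def antisense_oligo(coding_strand: str, length: int = 20, flank: int = 5) -> str:
    """Return a DNA oligo antisense to the window around the first ATG.

    The target window begins `flank` bases upstream of the start codon
    (or at position 0 if less 5' sequence is available) and spans
    `length` bases of the coding strand.
    """
    seq = coding_strand.upper().replace("U", "T")  # accept mRNA-style input
    start = seq.find("ATG")
    if start < 0:
        raise ValueError("no ATG start codon found")
    window_start = max(0, start - flank)
    target = seq[window_start:window_start + length]
    if len(target) < length:
        raise ValueError("sequence too short for the requested oligo length")
    return reverse_complement(target)


# Hypothetical toy coding strand, for illustration only.
toy = "GCCACCATGGCTAGCTTGACCGGTAAACTG"
oligo = antisense_oligo(toy, length=20, flank=5)
```

The returned oligonucleotide is written 5' to 3' and, by construction, its reverse complement reproduces the targeted window of the coding strand.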

Examples of modified nucleotides that can be used to generate the antisense nucleic
acid include: 5-fluorouracil, 5-bromouracil, 5-chlorouracil, 5-iodouracil, hypoxanthine,
xanthine, 4-acetylcytosine, 5-(carboxyhydroxylmethyl) uracil, 5-carboxymethylaminomethyl-
2-thiouridine, 5-carboxymethylaminomethyluracil, dihydrouracil, beta-D-galactosylqueosine,
inosine, N6-isopentenyladenine, 1-methylguanine, 1-methylinosine, 2,2-dimethylguanine,
2-methyladenine, 2-methylguanine, 3-methylcytosine, 5-methylcytosine, N6-adenine,
7-methylguanine, 5-methylaminomethyluracil, 5-methoxyaminomethyl-2-thiouracil,
beta-D-mannosylqueosine, 5'-methoxycarboxymethyluracil, 5-methoxyuracil,
2-methylthio-N6-isopentenyladenine, uracil-5-oxyacetic acid (v), wybutoxosine, pseudouracil,
queosine, 2-thiocytosine, 5-methyl-2-thiouracil, 2-thiouracil, 4-thiouracil, 5-methyluracil,
uracil-5-oxyacetic acid methylester, uracil-5-oxyacetic acid (v), 5-methyl-2-thiouracil,
3-(3-amino-3-N-2-carboxypropyl) uracil, (acp3)w, and 2,6-diaminopurine. Alternatively, the
antisense nucleic acid can be produced biologically using an expression vector into which a
nucleic acid has been subcloned in an antisense orientation (i.e., RNA transcribed from the
inserted nucleic acid will be of an antisense orientation to a target nucleic acid of interest,
described further in the following subsection).

The antisense nucleic acid molecules of the invention are typically administered to a
subject or generated in situ such that they hybridize with or bind to cellular mRNA and/or
genomic DNA encoding an NOVX protein to thereby inhibit expression of the protein (e.g., by
inhibiting transcription and/or translation). The hybridization can be by conventional
nucleotide complementarity to form a stable duplex, or, for example, in the case of an
antisense nucleic acid molecule that binds to DNA duplexes, through specific interactions in
the major groove of the double helix. An example of a route of administration of antisense
nucleic acid molecules of the invention includes direct injection at a tissue site. Alternatively,
antisense nucleic acid molecules can be modified to target selected cells and then administered
systemically. For example, for systemic administration, antisense molecules can be modified
such that they specifically bind to receptors or antigens expressed on a selected cell surface
(e.g., by linking the antisense nucleic acid molecules to peptides or antibodies that bind to cell
surface receptors or antigens). The antisense nucleic acid molecules can also be delivered to
cells using the vectors described herein. To achieve sufficient levels of the nucleic acid
molecules, vector constructs in which the antisense nucleic acid molecule is placed under the
control of a strong pol II or pol III promoter are preferred.

In yet another embodiment, the antisense nucleic acid molecule of the invention is an
α-anomeric nucleic acid molecule. An α-anomeric nucleic acid molecule forms specific
double-stranded hybrids with complementary RNA in which, contrary to the usual β-units, the
strands run parallel to each other. See, e.g., Gaultier, et al., 1987. Nucl. Acids Res. 15:
6625-6641. The antisense nucleic acid molecule can also comprise a
2'-O-methylribonucleotide (see, e.g., Inoue, et al., 1987. Nucl. Acids Res. 15: 6131-6148) or a
chimeric RNA-DNA analogue (see, e.g., Inoue, et al., 1987. FEBS Lett. 215: 327-330).

Ribozymes and PNA Moieties

Nucleic acid modifications include, by way of non-limiting example, modified bases,
and nucleic acids whose sugar phosphate backbones are modified or derivatized. These
modifications are carried out at least in part to enhance the chemical stability of the modified
nucleic acid, such that they may be used, for example, as antisense binding nucleic acids in
therapeutic applications in a subject.

In one embodiment, an antisense nucleic acid of the invention is a ribozyme.
Ribozymes are catalytic RNA molecules with ribonuclease activity that are capable of
cleaving a single-stranded nucleic acid, such as an mRNA, to which they have a
complementary region. Thus, ribozymes (e.g., hammerhead ribozymes as described in
Haselhoff and Gerlach 1988. Nature 334: 585-591) can be used to catalytically cleave NOVX
mRNA transcripts to thereby inhibit translation of NOVX mRNA. A ribozyme having
specificity for an NOVX-encoding nucleic acid can be designed based upon the nucleotide
sequence of an NOVX cDNA disclosed herein (i.e., SEQ ID NOS:1, 3, 5, 7, 9, 11, 13, 15, 17,
19, 21, 23, and 25). For example, a derivative of a Tetrahymena L-19 IVS RNA can be
constructed in which the nucleotide sequence of the active site is complementary to the
nucleotide sequence to be cleaved in an NOVX-encoding mRNA. See, e.g., U.S. Patent
4,987,071 to Cech, et al. and U.S. Patent 5,116,742 to Cech, et al. NOVX mRNA can also be
used to select a catalytic RNA having a specific ribonuclease activity from a pool of RNA
molecules. See, e.g., Bartel et al., (1993) Science 261: 1411-1418.

Alternatively, NOVX gene expression can be inhibited by targeting nucleotide
sequences complementary to the regulatory region of the NOVX nucleic acid (e.g., the NOVX
promoter and/or enhancers) to form triple helical structures that prevent transcription of the
NOVX gene in target cells. See, e.g., Helene, 1991. Anticancer Drug Des. 6: 569-84; Helene,
et al., 1992. Ann. N.Y. Acad. Sci. 660: 27-36; Maher, 1992. Bioassays 14: 807-15.

In various embodiments, the NOVX nucleic acids can be modified at the base moiety,
sugar moiety or phosphate backbone to improve, e.g., the stability, hybridization, or solubility
of the molecule. For example, the deoxyribose phosphate backbone of the nucleic acids can
be modified to generate peptide nucleic acids. See, e.g., Hyrup, et al., 1996. Bioorg Med
Chem 4: 5-23. As used herein, the terms "peptide nucleic acids" or "PNAs" refer to nucleic
acid mimics (e.g., DNA mimics) in which the deoxyribose phosphate backbone is replaced by
a pseudopeptide backbone and only the four natural nucleobases are retained. The neutral
backbone of PNAs has been shown to allow for specific hybridization to DNA and RNA under
conditions of low ionic strength. The synthesis of PNA oligomers can be performed using
standard solid phase peptide synthesis protocols as described in Hyrup, et al., 1996. supra;
Perry-O'Keefe, et al., 1996. Proc. Natl. Acad. Sci. USA 93: 14670-14675.

PNAs of NOVX can be used in therapeutic and diagnostic applications. For example,
PNAs can be used as antisense or antigene agents for sequence-specific modulation of gene
expression by, e.g., inducing transcription or translation arrest or inhibiting replication. PNAs
of NOVX can also be used, for example, in the analysis of single base pair mutations in a gene
(e.g., PNA directed PCR clamping); as artificial restriction enzymes when used in combination
with other enzymes, e.g., S1 nucleases (see, Hyrup, et al., 1996. supra); or as probes or primers
for DNA sequence and hybridization (see, Hyrup, et al., 1996. supra; Perry-O'Keefe, et al.,
1996. supra).

In another embodiment, PNAs of NOVX can be modified, e.g., to enhance their
stability or cellular uptake, by attaching lipophilic or other helper groups to PNA, by the
formation of PNA-DNA chimeras, or by the use of liposomes or other techniques of drug
delivery known in the art. For example, PNA-DNA chimeras of NOVX can be generated that
may combine the advantageous properties of PNA and DNA. Such chimeras allow DNA
recognition enzymes (e.g., RNase H and DNA polymerases) to interact with the DNA portion
while the PNA portion would provide high binding affinity and specificity. PNA-DNA
chimeras can be linked using linkers of appropriate lengths selected in terms of base stacking,
number of bonds between the nucleobases, and orientation (see, Hyrup, et al., 1996. supra).
The synthesis of PNA-DNA chimeras can be performed as described in Hyrup, et al., 1996.
supra and Finn, et al., 1996. Nucl Acids Res 24: 3357-3363. For example, a DNA chain can
be synthesized on a solid support using standard phosphoramidite coupling chemistry, and
modified nucleoside analogs, e.g., 5'-(4-methoxytrityl)amino-5'-deoxy-thymidine
phosphoramidite, can be used between the PNA and the 5' end of DNA. See, e.g., Mag, et al.,
1989. Nucl Acid Res 17: 5973-5988. PNA monomers are then coupled in a stepwise manner
to produce a chimeric molecule with a 5' PNA segment and a 3' DNA segment. See, e.g.,
Finn, et al., 1996. supra. Alternatively, chimeric molecules can be synthesized with a 5' DNA
segment and a 3' PNA segment. See, e.g., Petersen, et al., 1975. Bioorg. Med. Chem. Lett. 5:
1119-11124.

In other embodiments, the oligonucleotide may include other appended groups such as
peptides (e.g., for targeting host cell receptors in vivo), or agents facilitating transport across
the cell membrane (see, e.g., Letsinger, et al., 1989. Proc. Natl. Acad. Sci. U.S.A. 86:
6553-6556; Lemaitre, et al., 1987. Proc. Natl. Acad. Sci. 84: 648-652; PCT Publication No.
WO88/09810) or the blood-brain barrier (see, e.g., PCT Publication No. WO 89/10134). In
addition, oligonucleotides can be modified with hybridization triggered cleavage agents (see,
e.g., Krol, et al., 1988. BioTechniques 6:958-976) or intercalating agents (see, e.g., Zon, 1988.
Pharm. Res. 5: 539-549). To this end, the oligonucleotide may be conjugated to another
molecule, e.g., a peptide, a hybridization triggered cross-linking agent, a transport agent, a
hybridization-triggered cleavage agent, and the like.

NOVX Polypeptides 

A polypeptide according to the invention includes a polypeptide including the amino
acid sequence of NOVX polypeptides whose sequences are provided in SEQ ID NOS:2, 4, 6,
8, 10, 12, 14, 16, 18, 20, 22, 24, and 26. The invention also includes a mutant or variant
protein any of whose residues may be changed from the corresponding residues shown in SEQ
ID NOS:2, 4, 6, 8, 10, 12, 14, 16, 18, 20, 22, 24, and 26 while still encoding a protein that
maintains its NOVX activities and physiological functions, or a functional fragment thereof.

In general, an NOVX variant that preserves NOVX-like function includes any variant
in which residues at a particular position in the sequence have been substituted by other amino
acids, and further includes the possibility of inserting an additional residue or residues between
two residues of the parent protein as well as the possibility of deleting one or more residues
from the parent sequence. Any amino acid substitution, insertion, or deletion is encompassed
by the invention. In favorable circumstances, the substitution is a conservative substitution as
defined above.

One aspect of the invention pertains to isolated NOVX proteins, and biologically-active
portions thereof, or derivatives, fragments, analogs or homologs thereof. Also provided
are polypeptide fragments suitable for use as immunogens to raise anti-NOVX antibodies. In
one embodiment, native NOVX proteins can be isolated from cells or tissue sources by an
appropriate purification scheme using standard protein purification techniques. In another
embodiment, NOVX proteins are produced by recombinant DNA techniques. Alternative to
recombinant expression, an NOVX protein or polypeptide can be synthesized chemically
using standard peptide synthesis techniques.

An "isolated" or "purified" polypeptide or protein or biologically-active portion thereof
is substantially free of cellular material or other contaminating proteins from the cell or tissue
source from which the NOVX protein is derived, or substantially free from chemical
precursors or other chemicals when chemically synthesized. The language "substantially free
of cellular material" includes preparations of NOVX proteins in which the protein is separated
from cellular components of the cells from which it is isolated or recombinantly-produced. In
one embodiment, the language "substantially free of cellular material" includes preparations of
NOVX proteins having less than about 30% (by dry weight) of non-NOVX proteins (also
referred to herein as a "contaminating protein"), more preferably less than about 20% of
non-NOVX proteins, still more preferably less than about 10% of non-NOVX proteins, and
most preferably less than about 5% of non-NOVX proteins. When the NOVX protein or
biologically-active portion thereof is recombinantly-produced, it is also preferably
substantially free of culture medium, i.e., culture medium represents less than about 20%,
more preferably less than about 10%, and most preferably less than about 5% of the volume of
the NOVX protein preparation.
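
Purely for illustration, the dry-weight thresholds recited above for contaminating protein can be expressed as a simple calculation. In the sketch below the function names and tier labels are hypothetical, and the word "about" in the text is treated as a strict cutoff, which is a simplifying assumption.

```python
def contaminant_percent(target_mg: float, contaminant_mg: float) -> float:
    """Percent of total dry protein weight that is contaminating protein."""
    total = target_mg + contaminant_mg
    if total <= 0:
        raise ValueError("total dry weight must be positive")
    return 100.0 * contaminant_mg / total


def purity_tier(percent_contaminant: float) -> str:
    """Map a contaminant percentage onto the 30/20/10/5 percent thresholds.

    Thresholds are applied as strict 'less than' cutoffs (assumption).
    """
    tiers = (
        (5.0, "most preferred (<5%)"),
        (10.0, "still more preferred (<10%)"),
        (20.0, "more preferred (<20%)"),
        (30.0, "substantially free (<30%)"),
    )
    for limit, label in tiers:
        if percent_contaminant < limit:
            return label
    return "not substantially free"
```

For example, a preparation containing 98 mg of the protein of interest and 2 mg of other protein would, under these assumptions, fall in the most preferred tier.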

The language "substantially free of chemical precursors or other chemicals" includes
preparations of NOVX proteins in which the protein is separated from chemical precursors or
other chemicals that are involved in the synthesis of the protein. In one embodiment, the
language "substantially free of chemical precursors or other chemicals" includes preparations
of NOVX proteins having less than about 30% (by dry weight) of chemical precursors or
non-NOVX chemicals, more preferably less than about 20% chemical precursors or
non-NOVX chemicals, still more preferably less than about 10% chemical precursors or
non-NOVX chemicals, and most preferably less than about 5% chemical precursors or
non-NOVX chemicals.

Biologically-active portions of NOVX proteins include peptides comprising amino
acid sequences sufficiently homologous to or derived from the amino acid sequences of the
NOVX proteins (e.g., the amino acid sequence shown in SEQ ID NOS:2, 4, 6, 8, 10, 12, 14,
16, 18, 20, 22, 24, and 26) that include fewer amino acids than the full-length NOVX proteins,
and exhibit at least one activity of an NOVX protein. Typically, biologically-active portions
comprise a domain or motif with at least one activity of the NOVX protein. A biologically-active
portion of an NOVX protein can be a polypeptide which is, for example, 10, 25, 50, 100
or more amino acid residues in length.

Moreover, other biologically-active portions, in which other regions of the protein are
deleted, can be prepared by recombinant techniques and evaluated for one or more of the
functional activities of a native NOVX protein.

In an embodiment, the NOVX protein has an amino acid sequence shown in SEQ ID
NOS:2, 4, 6, 8, 10, 12, 14, 16, 18, 20, 22, 24, and 26. In other embodiments, the NOVX
protein is substantially homologous to SEQ ID NOS:2, 4, 6, 8, 10, 12, 14, 16, 18, 20, 22, 24,
and 26, and retains the functional activity of the protein of SEQ ID NOS:2, 4, 6, 8, 10, 12, 14,
16, 18, 20, 22, 24, and 26, yet differs in amino acid sequence due to natural allelic variation or
mutagenesis, as described in detail below. Accordingly, in another embodiment, the NOVX
protein is a protein that comprises an amino acid sequence at least about 45% homologous to
the amino acid sequence of SEQ ID NOS:2, 4, 6, 8, 10, 12, 14, 16, 18, 20, 22, 24, and 26, and
retains the functional activity of the NOVX proteins of SEQ ID NOS:2, 4, 6, 8, 10, 12, 14, 16,
18, 20, 22, 24, and 26.

Determining Homology Between Two or More Sequences 

To determine the percent homology of two amino acid sequences or of two nucleic
acids, the sequences are aligned for optimal comparison purposes (e.g., gaps can be introduced
in the sequence of a first amino acid or nucleic acid sequence for optimal alignment with a
second amino or nucleic acid sequence). The amino acid residues or nucleotides at
corresponding amino acid positions or nucleotide positions are then compared. When a
position in the first sequence is occupied by the same amino acid residue or nucleotide as the
corresponding position in the second sequence, then the molecules are homologous at that
position (i.e., as used herein amino acid or nucleic acid "homology" is equivalent to amino
acid or nucleic acid "identity").

The nucleic acid sequence homology may be determined as the degree of identity
between two sequences. The homology may be determined using computer programs known
in the art, such as GAP software provided in the GCG program package. See, Needleman and
Wunsch, 1970. J Mol Biol 48: 443-453. Using GCG GAP software with the following settings
for nucleic acid sequence comparison: GAP creation penalty of 5.0 and GAP extension
penalty of 0.3, the coding region of the analogous nucleic acid sequences referred to above
exhibits a degree of identity preferably of at least 70%, 75%, 80%, 85%, 90%, 95%, 98%, or
99%, with the CDS (encoding) part of the DNA sequence shown in SEQ ID NOS:1, 3, 5, 7, 9,
11, 13, 15, 17, 19, 21, 23, and 25.
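
To illustrate the kind of global alignment underlying such a comparison, a minimal Needleman-Wunsch style sketch with affine gap penalties is given below. It is not the GCG GAP program and is not asserted to reproduce its output: the function name `gap_align`, the match score of 1.0, the mismatch score of 0.0, and the convention that a gap of length k costs the creation penalty plus k times the extension penalty are all assumptions made for the example. Only the penalty values 5.0 and 0.3 come from the text.

```python
import math


def gap_align(a, b, match=1.0, mismatch=0.0, gap_open=5.0, gap_extend=0.3):
    """Global alignment with affine gaps; returns (score, aligned_a, aligned_b).

    States: 0 = residue aligned to residue, 1 = gap in b, 2 = gap in a.
    A gap of length k costs gap_open + k * gap_extend (assumed convention).
    """
    n, m = len(a), len(b)
    neg = -math.inf
    score = [[[neg] * (m + 1) for _ in range(n + 1)] for _ in range(3)]
    prev = [[[0] * (m + 1) for _ in range(n + 1)] for _ in range(3)]
    score[0][0][0] = 0.0
    for i in range(1, n + 1):  # leading gap in b
        score[1][i][0] = -(gap_open + i * gap_extend)
        prev[1][i][0] = 1
    for j in range(1, m + 1):  # leading gap in a
        score[2][0][j] = -(gap_open + j * gap_extend)
        prev[2][0][j] = 2
    new_gap = gap_open + gap_extend
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            sub = match if a[i - 1] == b[j - 1] else mismatch
            options = (
                # state 0: align a[i-1] with b[j-1], coming from any state
                [score[s][i - 1][j - 1] + sub for s in range(3)],
                # state 1: align a[i-1] with a gap
                [score[0][i - 1][j] - new_gap, score[1][i - 1][j] - gap_extend,
                 score[2][i - 1][j] - new_gap],
                # state 2: align b[j-1] with a gap
                [score[0][i][j - 1] - new_gap, score[1][i][j - 1] - new_gap,
                 score[2][i][j - 1] - gap_extend],
            )
            for state, cands in enumerate(options):
                best = max(range(3), key=cands.__getitem__)
                score[state][i][j] = cands[best]
                prev[state][i][j] = best
    # Trace back from the best-scoring final state.
    state = max(range(3), key=lambda s: score[s][n][m])
    total = score[state][n][m]
    i, j, out_a, out_b = n, m, [], []
    while i > 0 or j > 0:
        came_from = prev[state][i][j]
        if state == 0:
            i, j = i - 1, j - 1
            out_a.append(a[i])
            out_b.append(b[j])
        elif state == 1:
            i -= 1
            out_a.append(a[i])
            out_b.append("-")
        else:
            j -= 1
            out_a.append("-")
            out_b.append(b[j])
        state = came_from
    return total, "".join(reversed(out_a)), "".join(reversed(out_b))
```

The aligned strings returned by such a routine, with "-" marking gap positions, are the form of input on which a residue-by-residue identity calculation can then be carried out.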

The term "sequence identity" refers to the degree to which two polynucleotide or
polypeptide sequences are identical on a residue-by-residue basis over a particular region of
comparison. The term "percentage of sequence identity" is calculated by comparing two
optimally aligned sequences over that region of comparison, determining the number of
positions at which the identical nucleic acid base (e.g., A, T, C, G, U, or I, in the case of
nucleic acids) occurs in both sequences to yield the number of matched positions, dividing the
number of matched positions by the total number of positions in the region of comparison (i.e.,
the window size), and multiplying the result by 100 to yield the percentage of sequence
identity. The term "substantial identity" as used herein denotes a characteristic of a
polynucleotide sequence, wherein the polynucleotide comprises a sequence that has at least 80
percent sequence identity, preferably at least 85 percent identity and often 90 to 95 percent
sequence identity, more usually at least 99 percent sequence identity as compared to a
reference sequence over a comparison region.
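
The calculation of percentage of sequence identity described above (matched positions divided by window size, multiplied by 100) may be sketched as follows. The function name `percent_identity` and the use of "-" as the gap character are assumptions of the example; the inputs are taken to be two already-aligned sequences covering the same region of comparison.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percentage of sequence identity over an aligned comparison window.

    Both inputs must be the same length (the window size). A position
    counts as matched when both sequences carry the same residue; a gap
    character ('-', assumed here) never counts as a match.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    window = len(aligned_a)
    if window == 0:
        raise ValueError("comparison window is empty")
    matches = sum(
        1
        for x, y in zip(aligned_a.upper(), aligned_b.upper())
        if x == y and x != "-"
    )
    return 100.0 * matches / window
```

For instance, two aligned ten-base sequences differing at a single position share nine matched positions over a window of ten, i.e., 90 percent sequence identity.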

Chimeric and Fusion Proteins 

The invention also provides NOVX chimeric or fusion proteins. As used herein, an
NOVX "chimeric protein" or "fusion protein" comprises an NOVX polypeptide operatively-linked
to a non-NOVX polypeptide. An "NOVX polypeptide" refers to a polypeptide having
an amino acid sequence corresponding to an NOVX protein (SEQ ID NOS:2, 4, 6, 8, 10, 12,
14, 16, 18, 20, 22, 24, and 26), whereas a "non-NOVX polypeptide" refers to a polypeptide
having an amino acid sequence corresponding to a protein that is not substantially homologous
to the NOVX protein, e.g., a protein that is different from the NOVX protein and that is
derived from the same or a different organism. Within an NOVX fusion protein the NOVX
polypeptide can correspond to all or a portion of an NOVX protein. In one embodiment, an
NOVX fusion protein comprises at least one biologically-active portion of an NOVX protein.
In another embodiment, an NOVX fusion protein comprises at least two biologically-active
portions of an NOVX protein. In yet another embodiment, an NOVX fusion protein
comprises at least three biologically-active portions of an NOVX protein. Within the fusion
protein, the term "operatively-linked" is intended to indicate that the NOVX polypeptide and
the non-NOVX polypeptide are fused in-frame with one another. The non-NOVX polypeptide
can be fused to the N-terminus or C-terminus of the NOVX polypeptide.
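
By way of example only, the requirement that two coding sequences be fused in-frame can be checked computationally before a construct is made. The sketch below assumes the standard genetic code and uses hypothetical names (`fuse_in_frame`, `translate`) and short toy sequences that are neither GST nor NOVX sequences; linker design and restriction sites are not addressed.

```python
from itertools import product

# Standard genetic code, codons enumerated in TCAG order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def translate(dna: str) -> str:
    """Translate complete codons of a DNA coding sequence ('*' marks a stop)."""
    dna = dna.upper()
    usable = len(dna) - len(dna) % 3
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, usable, 3))


def fuse_in_frame(upstream_cds: str, downstream_cds: str) -> str:
    """Join two coding sequences so the downstream partner stays in frame.

    The upstream (N-terminal) partner must be a whole number of codons
    and must not contain a stop codon.
    """
    up = upstream_cds.upper()
    if len(up) % 3 != 0:
        raise ValueError("upstream sequence is not a whole number of codons")
    if "*" in translate(up):
        raise ValueError("upstream sequence contains a stop codon")
    return up + downstream_cds.upper()


# Hypothetical toy sequences, for illustration only.
fusion = fuse_in_frame("ATGTCCCCT", "ATGGCTTAA")
protein = translate(fusion)
```

Translating the fused toy sequence yields a single polypeptide in which the downstream residues follow the upstream residues without a frame shift.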

In one embodiment, the fusion protein is a GST-NOVX fusion protein in which the
NOVX sequences are fused to the C-terminus of the GST (glutathione S-transferase)
sequences. Such fusion proteins can facilitate the purification of recombinant NOVX
polypeptides.

In another embodiment, the fusion protein is an NOVX protein containing a
heterologous signal sequence at its N-terminus. In certain host cells (e.g., mammalian host
cells), expression and/or secretion of NOVX can be increased through use of a heterologous
signal sequence.

In yet another embodiment, the fusion protein is an NOVX-immunoglobulin fusion
protein in which the NOVX sequences are fused to sequences derived from a member of the
immunoglobulin protein family. The NOVX-immunoglobulin fusion proteins of the invention
can be incorporated into pharmaceutical compositions and administered to a subject to inhibit
an interaction between an NOVX ligand and an NOVX protein on the surface of a cell, to
thereby suppress NOVX-mediated signal transduction in vivo. The NOVX-immunoglobulin
fusion proteins can be used to affect the bioavailability of an NOVX cognate ligand.
Inhibition of the NOVX ligand/NOVX interaction may be useful therapeutically for both the
treatment of proliferative and differentiative disorders, as well as modulating (e.g., promoting
or inhibiting) cell survival. Moreover, the NOVX-immunoglobulin fusion proteins of the
invention can be used as immunogens to produce anti-NOVX antibodies in a subject, to purify
NOVX ligands, and in screening assays to identify molecules that inhibit the interaction of
NOVX with an NOVX ligand.

An NOVX chimeric or fusion protein of the invention can be produced by standard
recombinant DNA techniques. For example, DNA fragments coding for the different
polypeptide sequences are ligated together in-frame in accordance with conventional
techniques, e.g., by employing blunt-ended or stagger-ended termini for ligation, restriction
enzyme digestion to provide for appropriate termini, filling-in of cohesive ends as appropriate,
alkaline phosphatase treatment to avoid undesirable joining, and enzymatic ligation. In
another embodiment, the fusion gene can be synthesized by conventional techniques including
automated DNA synthesizers. Alternatively, PCR amplification of gene fragments can be
carried out using anchor primers that give rise to complementary overhangs between two
consecutive gene fragments that can subsequently be annealed and reamplified to generate a
chimeric gene sequence (see, e.g., Ausubel, et al. (eds.) CURRENT PROTOCOLS IN MOLECULAR
BIOLOGY, John Wiley & Sons, 1992). Moreover, many expression vectors are commercially
available that already encode a fusion moiety (e.g., a GST polypeptide). An NOVX-encoding
nucleic acid can be cloned into such an expression vector such that the fusion moiety is linked
in-frame to the NOVX protein.

NOVX Agonists and Antagonists

The invention also pertains to variants of the NOVX proteins that function as either
NOVX agonists (i.e., mimetics) or as NOVX antagonists. Variants of the NOVX protein can
be generated by mutagenesis (e.g., discrete point mutation or truncation of the NOVX protein).
An agonist of the NOVX protein can retain substantially the same, or a subset of, the
biological activities of the naturally occurring form of the NOVX protein. An antagonist of
the NOVX protein can inhibit one or more of the activities of the naturally occurring form of
the NOVX protein by, for example, competitively binding to a downstream or upstream
member of a cellular signaling cascade which includes the NOVX protein. Thus, specific
biological effects can be elicited by treatment with a variant of limited function. In one
embodiment, treatment of a subject with a variant having a subset of the biological activities
of the naturally occurring form of the protein has fewer side effects in a subject relative to
treatment with the naturally occurring form of the NOVX proteins.

Variants of the NOVX proteins that function as either NOVX agonists (i.e., mimetics)
or as NOVX antagonists can be identified by screening combinatorial libraries of mutants
(e.g., truncation mutants) of the NOVX proteins for NOVX protein agonist or antagonist
activity. In one embodiment, a variegated library of NOVX variants is generated by
combinatorial mutagenesis at the nucleic acid level and is encoded by a variegated gene
library. A variegated library of NOVX variants can be produced by, for example,
enzymatically ligating a mixture of synthetic oligonucleotides into gene sequences such that a
degenerate set of potential NOVX sequences is expressible as individual polypeptides, or
alternatively, as a set of larger fusion proteins (e.g., for phage display) containing the set of
NOVX sequences therein. There are a variety of methods which can be used to produce
libraries of potential NOVX variants from a degenerate oligonucleotide sequence. Chemical
synthesis of a degenerate gene sequence can be performed in an automatic DNA synthesizer,
and the synthetic gene then ligated into an appropriate expression vector. Use of a degenerate
set of genes allows for the provision, in one mixture, of all of the sequences encoding the
desired set of potential NOVX sequences. Methods for synthesizing degenerate
oligonucleotides are well-known within the art. See, e.g., Narang, 1983. Tetrahedron 39: 3;
Itakura, et al., 1984. Annu. Rev. Biochem. 53: 323; Itakura, et al., 1984. Science 198: 1056;
Ike, et al., 1983. Nucl. Acids Res. 11: 477.
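
As an illustrative aid, the set of individual sequences represented by a degenerate oligonucleotide can be enumerated in silico. The following sketch assumes the standard IUPAC nucleotide ambiguity codes; the function names `expand_degenerate` and `library_size` are hypothetical, and the example does not model synthesis bias or cloning efficiency.

```python
from itertools import product

# Standard IUPAC nucleotide ambiguity codes (DNA alphabet).
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT", "K": "GT", "M": "AC",
    "B": "CGT", "D": "AGT", "H": "ACT", "V": "ACG", "N": "ACGT",
}


def expand_degenerate(oligo: str) -> list:
    """List every concrete sequence encoded by a degenerate oligonucleotide."""
    choices = [IUPAC[symbol] for symbol in oligo.upper()]
    return ["".join(bases) for bases in product(*choices)]


def library_size(oligo: str) -> int:
    """Number of distinct sequences without enumerating them."""
    size = 1
    for symbol in oligo.upper():
        size *= len(IUPAC[symbol])
    return size
```

For example, the degenerate codon NNK represents 4 x 4 x 2 = 32 concrete codons, and the library size grows multiplicatively with each additional degenerate position, which is why `library_size` is preferable to full enumeration for long oligonucleotides.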

Polypeptide Libraries 

In addition, libraries of fragments of the NOVX protein coding sequences can be used
to generate a variegated population of NOVX fragments for screening and subsequent
selection of variants of an NOVX protein. In one embodiment, a library of coding sequence
fragments can be generated by treating a double stranded PCR fragment of an NOVX coding
sequence with a nuclease under conditions wherein nicking occurs only about once per
molecule, denaturing the double stranded DNA, renaturing the DNA to form double-stranded
DNA that can include sense/antisense pairs from different nicked products, removing single
stranded portions from reformed duplexes by treatment with S1 nuclease, and ligating the
resulting fragment library into an expression vector. By this method, expression libraries can
be derived which encode N-terminal and internal fragments of various sizes of the NOVX
proteins.

Various techniques are known in the art for screening gene products of combinatorial
libraries made by point mutations or truncation, and for screening cDNA libraries for gene
products having a selected property. Such techniques are adaptable for rapid screening of the
gene libraries generated by the combinatorial mutagenesis of NOVX proteins. The most
widely used techniques, which are amenable to high throughput analysis, for screening large
gene libraries typically include cloning the gene library into replicable expression vectors,
transforming appropriate cells with the resulting library of vectors, and expressing the
combinatorial genes under conditions in which detection of a desired activity facilitates
isolation of the vector encoding the gene whose product was detected. Recursive ensemble
mutagenesis (REM), a new technique that enhances the frequency of functional mutants in the
libraries, can be used in combination with the screening assays to identify NOVX variants.
See, e.g., Arkin and Yourvan, 1992. Proc. Natl. Acad. Sci. USA 89: 7811-7815; Delgrave, et
al., 1993. Protein Engineering 6:327-331.



Anti-NOVX Antibodies 

The invention encompasses antibodies and antibody fragments, such as Fab or (Fab)2,
that bind immunospecifically to any of the NOVX polypeptides of said invention.

An isolated NOVX protein, or a portion or fragment thereof, can be used as an
immunogen to generate antibodies that bind to NOVX polypeptides using standard techniques
for polyclonal and monoclonal antibody preparation. The full-length NOVX proteins can be
used or, alternatively, the invention provides antigenic peptide fragments of NOVX proteins
for use as immunogens. The antigenic NOVX peptide comprises at least 4 amino acid
residues of the amino acid sequence shown in SEQ ID NOS:2, 4, 6, 8, 10, 12, 14, 16, 18, 20, 22,
24, and 26 and encompasses an epitope of NOVX such that an antibody raised against the
peptide forms a specific immune complex with NOVX. Preferably, the antigenic peptide
comprises at least 6, 8, 10, 15, 20, or 30 amino acid residues. Longer antigenic peptides are
sometimes preferable over shorter antigenic peptides, depending on use and according to
methods well known to someone skilled in the art.

In certain embodiments of the invention, at least one epitope encompassed by the
antigenic peptide is a region of NOVX that is located on the surface of the protein (e.g., a
hydrophilic region). As a means for targeting antibody production, hydropathy plots showing
regions of hydrophilicity and hydrophobicity may be generated by any method well known in
the art, including, for example, the Kyte Doolittle or the Hopp Woods methods, either with or
without Fourier transformation (see, e.g., Hopp and Woods, 1981. Proc. Natl. Acad. Sci. USA
78: 3824-3828; Kyte and Doolittle, 1982. J. Mol. Biol. 157: 105-142, each incorporated herein
by reference in their entirety).
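
As an illustration of the hydropathy analysis mentioned above, the following is a minimal Python sketch of a Kyte Doolittle sliding-window profile without Fourier transformation. The function names, the window length of 7 residues, the cutoff of -1.5, and the toy sequence are assumptions chosen for illustration only; they are not taken from this specification, and the toy sequence is not an NOVX sequence.

```python
# Kyte Doolittle hydropathy index for the 20 standard amino acids
# (positive = hydrophobic, negative = hydrophilic).
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def hydropathy_profile(sequence, window=7):
    """Return a list of (center_position, mean_hydropathy) tuples.

    Positions are 1-based and refer to the central residue of each window.
    The window length is an illustrative assumption.
    """
    seq = sequence.upper()
    if not 1 <= window <= len(seq):
        raise ValueError("window must be between 1 and the sequence length")
    # A KeyError here signals a non-standard residue code.
    scores = [KYTE_DOOLITTLE[aa] for aa in seq]
    profile = []
    for start in range(len(seq) - window + 1):
        mean = sum(scores[start:start + window]) / window
        center = start + window // 2 + 1
        profile.append((center, mean))
    return profile


def hydrophilic_windows(sequence, window=7, threshold=-1.5):
    """Return the windows whose mean hydropathy falls below the threshold.

    The threshold is a hypothetical cutoff; such windows are only candidate
    surface regions and would still need experimental confirmation.
    """
    return [(pos, score)
            for pos, score in hydropathy_profile(sequence, window)
            if score < threshold]


if __name__ == "__main__":
    toy_sequence = "MKKDDERSKLLIVVAFLLGG"  # hypothetical sequence
    for pos, score in hydrophilic_windows(toy_sequence):
        print(f"window centered on residue {pos}: mean hydropathy {score:.2f}")
```

Windows with strongly negative mean values mark hydrophilic stretches that might be considered as candidate antigenic peptides; the Hopp Woods method could be sketched the same way by substituting its hydrophilicity scale.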

As disclosed herein, NOVX protein sequences of SEQ ID NOS:2, 4, 6, 8, 10, 12, 14,
16, 18, 20, 22, 24, 26, or derivatives, fragments, analogs or homologs thereof, may be utilized
as immunogens in the generation of antibodies that immunospecifically bind these protein
components. The term "antibody" as used herein refers to immunoglobulin molecules and
immunologically-active portions of immunoglobulin molecules, i.e., molecules that contain an
antigen binding site that specifically binds (immunoreacts with) an antigen, such as NOVX.
Such antibodies include, but are not limited to, polyclonal, monoclonal, chimeric, single chain,
Fab and F(ab')2 fragments, and an Fab expression library. In a specific embodiment, antibodies to
human NOVX proteins are disclosed. Various procedures known within the art may be used
for the production of polyclonal or monoclonal antibodies to an NOVX protein sequence of
SEQ ID NOS:2, 4, 6, 8, 10, 12, 14, 16, 18, 20, 22, 24, 26, or a derivative, fragment, analog or
homolog thereof. Some of these proteins are discussed below.

For the production of polyclonal antibodies, various suitable host animals (e.g., rabbit,
goat, mouse or other mammal) may be immunized by injection with the native protein, or a
synthetic variant thereof, or a derivative of the foregoing. An appropriate immunogenic
preparation can contain, for example, recombinantly-expressed NOVX protein or a
chemically-synthesized NOVX polypeptide. The preparation can further include an adjuvant.
Various adjuvants used to increase the immunological response include, but are not limited to,
Freund's (complete and incomplete), mineral gels (e.g., aluminum hydroxide), surface active
substances (e.g., lysolecithin, pluronic polyols, polyanions, peptides, oil emulsions,
dinitrophenol, etc.), human adjuvants such as Bacille Calmette-Guerin and Corynebacterium
parvum, or similar immunostimulatory agents. If desired, the antibody molecules directed
against NOVX can be isolated from the mammal (e.g., from the blood) and further purified by
well known techniques, such as protein A chromatography to obtain the IgG fraction.

The term "monoclonal antibody" or "monoclonal antibody composition", as used
herein, refers to a population of antibody molecules that contain only one species of an antigen
binding site capable of immunoreacting with a particular epitope of NOVX. A monoclonal
antibody composition thus typically displays a single binding affinity for a particular NOVX
protein with which it immunoreacts. For preparation of monoclonal antibodies directed
towards a particular NOVX protein, or derivatives, fragments, analogs or homologs thereof,
any technique that provides for the production of antibody molecules by continuous cell line
culture may be utilized. Such techniques include, but are not limited to, the hybridoma
technique (see, e.g., Kohler & Milstein, 1975. Nature 256: 495-497); the trioma technique; the
human B-cell hybridoma technique (see, e.g., Kozbor, et al., 1983. Immunol. Today 4: 72) and
the EBV hybridoma technique to produce human monoclonal antibodies (see, e.g., Cole, et al.,
1985. In: Monoclonal Antibodies and Cancer Therapy, Alan R. Liss, Inc., pp. 77-96).
Human monoclonal antibodies may be utilized in the practice of the invention and may be
produced by using human hybridomas (see, e.g., Cote, et al., 1983. Proc. Natl. Acad. Sci. USA
80: 2026-2030) or by transforming human B-cells with Epstein Barr Virus in vitro (see, e.g.,
Cole, et al., 1985. In: Monoclonal Antibodies and Cancer Therapy, Alan R. Liss, Inc.,
pp. 77-96). Each of the above citations is incorporated herein by reference in its entirety.

According to the invention, techniques can be adapted for the production of
single-chain antibodies specific to an NOVX protein (see, e.g., U.S. Patent No. 4,946,778). In
addition, methods can be adapted for the construction of Fab expression libraries (see, e.g.,
Huse, et al., 1989. Science 246: 1275-1281) to allow rapid and effective identification of
monoclonal Fab fragments with the desired specificity for an NOVX protein or derivatives,
fragments, analogs or homologs thereof. Non-human antibodies can be "humanized" by
techniques well known in the art. See, e.g., U.S. Patent No. 5,225,539. Antibody fragments
that contain the idiotypes to an NOVX protein may be produced by techniques known in the
art including, but not limited to: (i) an F(ab')2 fragment produced by pepsin digestion of an
antibody molecule; (ii) an Fab fragment generated by reducing the disulfide bridges of an F(ab')2
fragment; (iii) an Fab fragment generated by the treatment of the antibody molecule with
papain and a reducing agent; and (iv) Fv fragments.

Additionally, recombinant anti-NOVX antibodies, such as chimeric and humanized
monoclonal antibodies, comprising both human and non-human portions, which can be made
using standard recombinant DNA techniques, are within the scope of the invention. Such
chimeric and humanized monoclonal antibodies can be produced by recombinant DNA
techniques known in the art, for example using methods described in International Application
No. PCT/US86/02269; European Patent Application No. 184,187; European Patent
Application No. 171,496; European Patent Application No. 173,494; PCT International
Publication No. WO 86/01533; U.S. Patent No. 4,816,567; U.S. Pat. No. 5,225,539; European
Patent Application No. 125,023; Better, et al., 1988. Science 240: 1041-1043; Liu, et al., 1987.
Proc. Natl. Acad. Sci. USA 84: 3439-3443; Liu, et al., 1987. J. Immunol. 139: 3521-3526; Sun,
et al., 1987. Proc. Natl. Acad. Sci. USA 84: 214-218; Nishimura, et al., 1987. Cancer Res. 47:
999-1005; Wood, et al., 1985. Nature 314: 446-449; Shaw, et al., 1988. J. Natl. Cancer Inst.
80: 1553-1559; Morrison (1985) Science 229: 1202-1207; Oi, et al. (1986) BioTechniques
4: 214; Jones, et al., 1986. Nature 321: 522-525; Verhoeyan, et al., 1988. Science 239: 1534;
and Beidler, et al., 1988. J. Immunol. 141: 4053-4060. Each of the above citations is
incorporated herein by reference in its entirety.

In one embodiment, methods for the screening of antibodies that possess the desired
specificity include, but are not limited to, enzyme-linked immunosorbent assay (ELISA) and
other immunologically-mediated techniques known within the art. In a specific embodiment,
selection of antibodies that are specific to a particular domain of an NOVX protein is
facilitated by generation of hybridomas that bind to the fragment of an NOVX protein
possessing such a domain. Thus, antibodies that are specific for a desired domain within an
NOVX protein, or derivatives, fragments, analogs or homologs thereof, are also provided
herein.

Anti-NOVX antibodies may be used in methods known within the art relating to the
localization and/or quantitation of an NOVX protein (e.g., for use in measuring levels of the
NOVX protein within appropriate physiological samples, for use in diagnostic methods, for
use in imaging the protein, and the like). In a given embodiment, antibodies for NOVX
proteins, or derivatives, fragments, analogs or homologs thereof, that contain the antibody
derived binding domain, are utilized as pharmacologically-active compounds (hereinafter
"Therapeutics").

An anti-NOVX antibody (e.g., monoclonal antibody) can be used to isolate an NOVX
polypeptide by standard techniques, such as affinity chromatography or immunoprecipitation.
An anti-NOVX antibody can facilitate the purification of natural NOVX polypeptide from
cells and of recombinantly-produced NOVX polypeptide expressed in host cells. Moreover,
an anti-NOVX antibody can be used to detect NOVX protein (e.g., in a cellular lysate or cell
supernatant) in order to evaluate the abundance and pattern of expression of the NOVX
protein. Anti-NOVX antibodies can be used diagnostically to monitor protein levels in tissue
as part of a clinical testing procedure, e.g., to determine the efficacy of a given
treatment regimen. Detection can be facilitated by coupling (i.e., physically linking) the
antibody to a detectable substance. Examples of detectable substances include various
enzymes, prosthetic groups, fluorescent materials, luminescent materials, bioluminescent
materials, and radioactive materials. Examples of suitable enzymes include horseradish
peroxidase, alkaline phosphatase, β-galactosidase, or acetylcholinesterase; examples of
suitable prosthetic group complexes include streptavidin/biotin and avidin/biotin; examples of
suitable fluorescent materials include umbelliferone, fluorescein, fluorescein isothiocyanate,
rhodamine, dichlorotriazinylamine fluorescein, dansyl chloride or phycoerythrin; an example
of a luminescent material includes luminol; examples of bioluminescent materials include
luciferase, luciferin, and aequorin; and examples of suitable radioactive material include ¹²⁵I,
¹³¹I, ³⁵S or ³H.

NOVX Recombinant Expression Vectors and Host Cells 

Another aspect of the invention pertains to vectors, preferably expression vectors,
containing a nucleic acid encoding an NOVX protein, or derivatives, fragments, analogs or
homologs thereof. As used herein, the term "vector" refers to a nucleic acid molecule capable
of transporting another nucleic acid to which it has been linked. One type of vector is a
"plasmid", which refers to a circular double stranded DNA loop into which additional DNA
segments can be ligated. Another type of vector is a viral vector, wherein additional DNA
segments can be ligated into the viral genome. Certain vectors are capable of autonomous
replication in a host cell into which they are introduced (e.g., bacterial vectors having a
bacterial origin of replication and episomal mammalian vectors). Other vectors (e.g.,
non-episomal mammalian vectors) are integrated into the genome of a host cell upon
introduction into the host cell, and thereby are replicated along with the host genome.
Moreover, certain vectors are capable of directing the expression of genes to which they are
operatively-linked. Such vectors are referred to herein as "expression vectors". In general,
expression vectors of utility in recombinant DNA techniques are often in the form of plasmids.
In the present specification, "plasmid" and "vector" can be used interchangeably as the
plasmid is the most commonly used form of vector. However, the invention is intended to
include such other forms of expression vectors, such as viral vectors (e.g., replication defective
retroviruses, adenoviruses and adeno-associated viruses), which serve equivalent functions.

The recombinant expression vectors of the invention comprise a nucleic acid of the
invention in a form suitable for expression of the nucleic acid in a host cell, which means that
the recombinant expression vectors include one or more regulatory sequences, selected on the
basis of the host cells to be used for expression, that are operatively-linked to the nucleic acid
sequence to be expressed. Within a recombinant expression vector, "operably-linked" is
intended to mean that the nucleotide sequence of interest is linked to the regulatory
sequence(s) in a manner that allows for expression of the nucleotide sequence (e.g., in an in
vitro transcription/translation system or in a host cell when the vector is introduced into the
host cell).

The term "regulatory sequence" is intended to include promoters, enhancers and other
expression control elements (e.g., polyadenylation signals). Such regulatory sequences are
described, for example, in Goeddel, Gene Expression Technology: Methods in
Enzymology 185, Academic Press, San Diego, Calif. (1990). Regulatory sequences include
those that direct constitutive expression of a nucleotide sequence in many types of host cell
and those that direct expression of the nucleotide sequence only in certain host cells (e.g.,
tissue-specific regulatory sequences). It will be appreciated by those skilled in the art that the
design of the expression vector can depend on such factors as the choice of the host cell to be
transformed, the level of expression of protein desired, etc. The expression vectors of the
invention can be introduced into host cells to thereby produce proteins or peptides, including
fusion proteins or peptides, encoded by nucleic acids as described herein (e.g., NOVX
proteins, mutant forms of NOVX proteins, fusion proteins, etc.).

The recombinant expression vectors of the invention can be designed for expression of
NOVX proteins in prokaryotic or eukaryotic cells. For example, NOVX proteins can be
expressed in bacterial cells such as Escherichia coli, insect cells (using baculovirus expression
vectors), yeast cells or mammalian cells. Suitable host cells are discussed further in Goeddel,
Gene Expression Technology: Methods in Enzymology 185, Academic Press, San
Diego, Calif. (1990). Alternatively, the recombinant expression vector can be transcribed and
translated in vitro, for example using T7 promoter regulatory sequences and T7 polymerase.

Expression of proteins in prokaryotes is most often carried out in Escherichia coli with
vectors containing constitutive or inducible promoters directing the expression of either fusion
or non-fusion proteins. Fusion vectors add a number of amino acids to a protein encoded
therein, usually to the amino terminus of the recombinant protein. Such fusion vectors
typically serve three purposes: (i) to increase expression of recombinant protein; (ii) to
increase the solubility of the recombinant protein; and (iii) to aid in the purification of the
recombinant protein by acting as a ligand in affinity purification. Often, in fusion expression
vectors, a proteolytic cleavage site is introduced at the junction of the fusion moiety and the
recombinant protein to enable separation of the recombinant protein from the fusion moiety
subsequent to purification of the fusion protein. Such enzymes, and their cognate recognition
sequences, include Factor Xa, thrombin and enterokinase. Typical fusion expression vectors
include pGEX (Pharmacia Biotech Inc; Smith and Johnson, 1988. Gene 67: 31-40), pMAL
(New England Biolabs, Beverly, Mass.) and pRIT5 (Pharmacia, Piscataway, N.J.) that fuse
glutathione S-transferase (GST), maltose E binding protein, or protein A, respectively, to the
target recombinant protein.

Examples of suitable inducible non-fusion E. coli expression vectors include pTrc
(Amrann et al., (1988) Gene 69: 301-315) and pET 11d (Studier et al., Gene Expression
Technology: Methods in Enzymology 185, Academic Press, San Diego, Calif. (1990)
60-89).

One strategy to maximize recombinant protein expression in E. coli is to express the
protein in a host bacteria with an impaired capacity to proteolytically cleave the recombinant
protein. See, e.g., Gottesman, Gene Expression Technology: Methods in Enzymology
185, Academic Press, San Diego, Calif. (1990) 119-128. Another strategy is to alter the
nucleic acid sequence of the nucleic acid to be inserted into an expression vector so that the
individual codons for each amino acid are those preferentially utilized in E. coli (see, e.g.,
Wada, et al., 1992. Nucl. Acids Res. 20: 2111-2118). Such alteration of nucleic acid
sequences of the invention can be carried out by standard DNA synthesis techniques.
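
The codon-replacement strategy described above can be sketched in Python as a simple back-translation that assigns one preferred codon to each amino acid. This is a minimal sketch under stated assumptions: the table below is an illustrative set of codons often reported as frequent in E. coli and is not taken from this specification or from Wada, et al.; the function name, the choice of TAA as the stop codon, and the example peptide are likewise hypothetical. In practice the table would be replaced with values from a measured codon usage compilation for the intended host strain.

```python
# Illustrative (assumed) one-codon-per-residue preference table for E. coli.
# Replace with values from a measured codon usage table before real use.
PREFERRED_CODON = {
    "A": "GCG", "R": "CGT", "N": "AAC", "D": "GAT", "C": "TGC",
    "Q": "CAG", "E": "GAA", "G": "GGC", "H": "CAT", "I": "ATT",
    "L": "CTG", "K": "AAA", "M": "ATG", "F": "TTT", "P": "CCG",
    "S": "AGC", "T": "ACC", "W": "TGG", "Y": "TAT", "V": "GTG",
}


def back_translate(protein, table=PREFERRED_CODON, add_stop=True):
    """Return a DNA coding sequence that uses one preferred codon per residue.

    The encoded amino acid sequence is unchanged; only the codon choice differs
    from the original nucleic acid.
    """
    codons = []
    for position, residue in enumerate(protein.upper(), start=1):
        if residue not in table:
            raise ValueError(f"no codon for residue {residue!r} at position {position}")
        codons.append(table[residue])
    if add_stop:
        codons.append("TAA")  # assumed stop codon for this sketch
    return "".join(codons)


if __name__ == "__main__":
    peptide = "MKLVD"  # hypothetical peptide, not an NOVX sequence
    print(back_translate(peptide))
```

A one-codon-per-residue table is the simplest form of this approach; it ignores considerations such as repeated sequence motifs or mRNA secondary structure, which a production design would likely need to address.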
In another embodiment, the NOVX expression vector is a yeast expression vector.
Examples of vectors for expression in yeast Saccharomyces cerevisiae include pYepSec1
(Baldari, et al., 1987. EMBO J. 6: 229-234), pMFa (Kurjan and Herskowitz, 1982. Cell 30:
933-943), pJRY88 (Schultz et al., 1987. Gene 54: 113-123), pYES2 (Invitrogen Corporation,
San Diego, Calif.), and picZ (Invitrogen Corp., San Diego, Calif.).

Alternatively, NOVX can be expressed in insect cells using baculovirus expression
vectors. Baculovirus vectors available for expression of proteins in cultured insect cells (e.g.,
SF9 cells) include the pAc series (Smith, et al., 1983. Mol. Cell. Biol. 3: 2156-2165) and the
pVL series (Lucklow and Summers, 1989. Virology 170: 31-39).

In yet another embodiment, a nucleic acid of the invention is expressed in mammalian
cells using a mammalian expression vector. Examples of mammalian expression vectors
include pCDM8 (Seed, 1987. Nature 329: 840) and pMT2PC (Kaufman, et al., 1987. EMBO
J. 6: 187-195). When used in mammalian cells, the expression vector's control functions are
often provided by viral regulatory elements. For example, commonly used promoters are
derived from polyoma, adenovirus 2, cytomegalovirus, and simian virus 40. For other suitable
expression systems for both prokaryotic and eukaryotic cells see, e.g., Chapters 16 and 17 of
Sambrook, et al., Molecular Cloning: A Laboratory Manual. 2nd ed., Cold Spring
Harbor Laboratory, Cold Spring Harbor Laboratory Press, Cold Spring Harbor, N.Y., 1989.

In another embodiment, the recombinant mammalian expression vector is capable of
directing expression of the nucleic acid preferentially in a particular cell type (e.g.,
tissue-specific regulatory elements are used to express the nucleic acid). Tissue-specific
regulatory elements are known in the art. Non-limiting examples of suitable tissue-specific
promoters include the albumin promoter (liver-specific; Pinkert, et al., 1987. Genes Dev. 1:
268-277), lymphoid-specific promoters (Calame and Eaton, 1988. Adv. Immunol. 43:
235-275), in particular promoters of T cell receptors (Winoto and Baltimore, 1989. EMBO J.
8: 729-733) and immunoglobulins (Banerji, et al., 1983. Cell 33: 729-740; Queen and
Baltimore, 1983. Cell 33: 741-748), neuron-specific promoters (e.g., the neurofilament
promoter; Byrne and Ruddle, 1989. Proc. Natl. Acad. Sci. USA 86: 5473-5477),
pancreas-specific promoters (Edlund, et al., 1985. Science 230: 912-916), and mammary
gland-specific promoters (e.g., milk whey promoter; U.S. Pat. No. 4,873,316 and European
Application Publication No. 264,166). Developmentally-regulated promoters are also
encompassed, e.g., the murine hox promoters (Kessel and Gruss, 1990. Science 249: 374-379)
and the α-fetoprotein promoter (Campes and Tilghman, 1989. Genes Dev. 3: 537-546).

The invention further provides a recombinant expression vector comprising a DNA
molecule of the invention cloned into the expression vector in an antisense orientation. That
is, the DNA molecule is operatively-linked to a regulatory sequence in a manner that allows
for expression (by transcription of the DNA molecule) of an RNA molecule that is antisense to
NOVX mRNA. Regulatory sequences operatively linked to a nucleic acid cloned in the
antisense orientation can be chosen that direct the continuous expression of the antisense RNA
molecule in a variety of cell types, for instance viral promoters and/or enhancers, or regulatory
sequences can be chosen that direct constitutive, tissue specific or cell type specific expression
of antisense RNA. The antisense expression vector can be in the form of a recombinant
plasmid, phagemid or attenuated virus in which antisense nucleic acids are produced under the
control of a high efficiency regulatory region, the activity of which can be determined by the
cell type into which the vector is introduced. For a discussion of the regulation of gene
expression using antisense genes see, e.g., Weintraub, et al., "Antisense RNA as a molecular
tool for genetic analysis," Reviews-Trends in Genetics, Vol. 1(1) 1986.
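
To make the antisense relationship concrete, the following minimal Python sketch computes the transcript expected when a coding DNA segment is cloned in the reverse orientation: the reverse complement of the sense strand, written as RNA. The function names and the six-nucleotide example are hypothetical and serve only as an illustration; they are not part of this specification.

```python
_COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(dna):
    """Return the reverse complement of a DNA sequence (A, C, G, T only)."""
    unexpected = set(dna.upper()) - set("ACGT")
    if unexpected:
        raise ValueError(f"unexpected characters in DNA sequence: {sorted(unexpected)}")
    return dna.translate(_COMPLEMENT)[::-1]


def antisense_rna(sense_dna):
    """Return the RNA expected from the insert cloned in antisense orientation.

    The result is complementary to the mRNA made from the sense strand and is
    written 5' to 3'.
    """
    return reverse_complement(sense_dna).upper().replace("T", "U")


if __name__ == "__main__":
    sense = "ATGGCC"  # hypothetical sense-strand fragment, 5' to 3'
    print("sense mRNA:   ", sense.replace("T", "U"))
    print("antisense RNA:", antisense_rna(sense))
```

Because the antisense transcript is the reverse complement of the sense mRNA, the two molecules can base-pair along their full length, which is the basis of the regulation discussed in the cited review.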

Another aspect of the invention pertains to host cells into which a recombinant
expression vector of the invention has been introduced. The terms "host cell" and
"recombinant host cell" are used interchangeably herein. It is understood that such terms refer
not only to the particular subject cell but also to the progeny or potential progeny of such a
cell. Because certain modifications may occur in succeeding generations due to either
mutation or environmental influences, such progeny may not, in fact, be identical to the parent
cell, but are still included within the scope of the term as used herein.

A host cell can be any prokaryotic or eukaryotic cell. For example, NOVX protein can
be expressed in bacterial cells such as E. coli, insect cells, yeast or mammalian cells (such as
Chinese hamster ovary cells (CHO) or COS cells). Other suitable host cells are known to
those skilled in the art.

Vector DNA can be introduced into prokaryotic or eukaryotic cells via conventional
transformation or transfection techniques. As used herein, the terms "transformation" and
"transfection" are intended to refer to a variety of art-recognized techniques for introducing
foreign nucleic acid (e.g., DNA) into a host cell, including calcium phosphate or calcium
chloride co-precipitation, DEAE-dextran-mediated transfection, lipofection, or
electroporation. Suitable methods for transforming or transfecting host cells can be found in
Sambrook, et al. (Molecular Cloning: A Laboratory Manual. 2nd ed., Cold Spring
Harbor Laboratory, Cold Spring Harbor Laboratory Press, Cold Spring Harbor, N.Y., 1989),
and other laboratory manuals.

For stable transfection of mammalian cells, it is known that, depending upon the
expression vector and transfection technique used, only a small fraction of cells may integrate
the foreign DNA into their genome. In order to identify and select these integrants, a gene that
encodes a selectable marker (e.g., resistance to antibiotics) is generally introduced into the
host cells along with the gene of interest. Various selectable markers include those that confer
resistance to drugs, such as G418, hygromycin and methotrexate. Nucleic acid encoding a
selectable marker can be introduced into a host cell on the same vector as that encoding
NOVX or can be introduced on a separate vector. Cells stably transfected with the introduced
nucleic acid can be identified by drug selection (e.g., cells that have incorporated the
selectable marker gene will survive, while the other cells die).

A host cell of the invention, such as a prokaryotic or eukaryotic host cell in culture, can
be used to produce (i.e., express) NOVX protein. Accordingly, the invention further provides
methods for producing NOVX protein using the host cells of the invention. In one
embodiment, the method comprises culturing the host cell of the invention (into which a
recombinant expression vector encoding NOVX protein has been introduced) in a suitable
medium such that NOVX protein is produced. In another embodiment, the method further
comprises isolating NOVX protein from the medium or the host cell.

Transgenic NOVX Animals 

The host cells of the invention can also be used to produce non-human transgenic
animals. For example, in one embodiment, a host cell of the invention is a fertilized oocyte or
an embryonic stem cell into which NOVX protein-coding sequences have been introduced.
Such host cells can then be used to create non-human transgenic animals in which exogenous
NOVX sequences have been introduced into their genome or homologous recombinant
animals in which endogenous NOVX sequences have been altered. Such animals are useful
for studying the function and/or activity of NOVX protein and for identifying and/or
evaluating modulators of NOVX protein activity. As used herein, a "transgenic animal" is a
non-human animal, preferably a mammal, more preferably a rodent such as a rat or mouse, in
which one or more of the cells of the animal includes a transgene. Other examples of
transgenic animals include non-human primates, sheep, dogs, cows, goats, chickens,
amphibians, etc. A transgene is exogenous DNA that is integrated into the genome of a cell
from which a transgenic animal develops and that remains in the genome of the mature
animal, thereby directing the expression of an encoded gene product in one or more cell types
or tissues of the transgenic animal. As used herein, a "homologous recombinant animal" is a
non-human animal, preferably a mammal, more preferably a mouse, in which an endogenous
NOVX gene has been altered by homologous recombination between the endogenous gene
and an exogenous DNA molecule introduced into a cell of the animal, e.g., an embryonic cell
of the animal, prior to development of the animal.

A transgenic animal of the invention can be created by introducing NOVX-encoding
nucleic acid into the male pronuclei of a fertilized oocyte (e.g., by microinjection, retroviral
infection) and allowing the oocyte to develop in a pseudopregnant female foster animal. The
human NOVX cDNA sequences SEQ ID NOS:1, 3, 5, 7, 9, 11, 13, 15, 17, 19, 21, 23, and 25
can be introduced as a transgene into the genome of a non-human animal. Alternatively, a
non-human homologue of the human NOVX gene, such as a mouse NOVX gene, can be
isolated based on hybridization to the human NOVX cDNA (described further supra) and used
as a transgene. Intronic sequences and polyadenylation signals can also be included in the
transgene to increase the efficiency of expression of the transgene. A tissue-specific
regulatory sequence(s) can be operably-linked to the NOVX transgene to direct expression of
NOVX protein to particular cells. Methods for generating transgenic animals via embryo
manipulation and microinjection, particularly animals such as mice, have become
conventional in the art and are described, for example, in U.S. Patent Nos. 4,736,866;
4,870,009; and 4,873,191; and Hogan, 1986. In: Manipulating the Mouse Embryo, Cold
Spring Harbor Laboratory Press, Cold Spring Harbor, N.Y. Similar methods are used for
production of other transgenic animals. A transgenic founder animal can be identified based
upon the presence of the NOVX transgene in its genome and/or expression of NOVX mRNA
in tissues or cells of the animals. A transgenic founder animal can then be used to breed
additional animals carrying the transgene. Moreover, transgenic animals carrying a transgene
encoding NOVX protein can further be bred to other transgenic animals carrying other
transgenes.

To create a homologous recombinant animal, a vector is prepared which contains at
least a portion of an NOVX gene into which a deletion, addition or substitution has been
introduced to thereby alter, e.g., functionally disrupt, the NOVX gene. The NOVX gene can
be a human gene (e.g., the cDNA of SEQ ID NOS:1, 3, 5, 7, 9, 11, 13, 15, 17, 19, 21, 23, and
25), but more preferably, is a non-human homologue of a human NOVX gene. For example, a
mouse homologue of human NOVX gene of SEQ ID NOS:1, 3, 5, 7, 9, 11, 13, 15, 17, 19, 21,
23, and 25 can be used to construct a homologous recombination vector suitable for altering an
endogenous NOVX gene in the mouse genome. In one embodiment, the vector is designed
such that, upon homologous recombination, the endogenous NOVX gene is functionally
disrupted (i.e., no longer encodes a functional protein; also referred to as a "knock out"
vector).

Alternatively, the vector can be designed such that, upon homologous recombination,
the endogenous NOVX gene is mutated or otherwise altered but still encodes functional
protein (e.g., the upstream regulatory region can be altered to thereby alter the expression of
the endogenous NOVX protein). In the homologous recombination vector, the altered portion
of the NOVX gene is flanked at its 5'- and 3'-termini by additional nucleic acid of the NOVX
gene to allow for homologous recombination to occur between the exogenous NOVX gene
carried by the vector and an endogenous NOVX gene in an embryonic stem cell. The
additional flanking NOVX nucleic acid is of sufficient length for successful homologous
recombination with the endogenous gene. Typically, several kilobases of flanking DNA (both
at the 5'- and 3'-termini) are included in the vector. See, e.g., Thomas, et al., 1987. Cell 51:
503 for a description of homologous recombination vectors. The vector is then introduced into
an embryonic stem cell line (e.g., by electroporation) and cells in which the introduced NOVX
gene has homologously-recombined with the endogenous NOVX gene are selected. See, e.g.,
Li, et al., 1992. Cell 69: 915.

The selected cells are then injected into a blastocyst of an animal (e.g., a mouse) to
form aggregation chimeras. See, e.g., Bradley, 1987. In: Teratocarcinomas and
Embryonic Stem Cells: A Practical Approach, Robertson, ed. IRL, Oxford, pp. 113-152.
A chimeric embryo can then be implanted into a suitable pseudopregnant female foster animal
and the embryo brought to term. Progeny harboring the homologously-recombined DNA in
their germ cells can be used to breed animals in which all cells of the animal contain the
homologously-recombined DNA by germline transmission of the transgene. Methods for
constructing homologous recombination vectors and homologous recombinant animals are
described further in Bradley, 1991. Curr. Opin. Biotechnol. 2: 823-829; PCT International
Publication Nos.: WO 90/11354; WO 91/01140; WO 92/0968; and WO 93/04169.

In another embodiment, transgenic non-human animals can be produced that contain
selected systems that allow for regulated expression of the transgene. One example of such a
system is the cre/loxP recombinase system of bacteriophage P1. For a description of the
cre/loxP recombinase system, see, e.g., Lakso, et al., 1992. Proc. Natl. Acad. Sci. USA 89:
6232-6236. Another example of a recombinase system is the FLP recombinase system of
Saccharomyces cerevisiae. See, O'Gorman, et al., 1991. Science 251: 1351-1355. If a cre/loxP
recombinase system is used to regulate expression of the transgene, animals containing
transgenes encoding both the Cre recombinase and a selected protein are required. Such
animals can be provided through the construction of "double" transgenic animals, e.g., by
mating two transgenic animals, one containing a transgene encoding a selected protein and the
other containing a transgene encoding a recombinase.

Clones of the non-human transgenic animals described herein can also be produced
according to the methods described in Wilmut, et al., 1997. Nature 385: 810-813. In brief, a
cell (e.g., a somatic cell) from the transgenic animal can be isolated and induced to exit the
growth cycle and enter G0 phase. The quiescent cell can then be fused, e.g., through the use of
electrical pulses, to an enucleated oocyte from an animal of the same species from which the
quiescent cell is isolated. The reconstructed oocyte is then cultured such that it develops to
morula or blastocyte and then transferred to a pseudopregnant female foster animal. The
offspring borne of this female foster animal will be a clone of the animal from which the cell
(e.g., the somatic cell) is isolated.

Pharmaceutical Compositions

The NOVX nucleic acid molecules, NOVX proteins, and anti-NOVX antibodies (also
referred to herein as "active compounds") of the invention, and derivatives, fragments, analogs
and homologs thereof, can be incorporated into pharmaceutical compositions suitable for
administration. Such compositions typically comprise the nucleic acid molecule, protein, or
antibody and a pharmaceutically acceptable carrier. As used herein, "pharmaceutically
acceptable carrier" is intended to include any and all solvents, dispersion media, coatings,
antibacterial and antifungal agents, isotonic and absorption delaying agents, and the like,
compatible with pharmaceutical administration. Suitable carriers are described in the most
recent edition of Remington's Pharmaceutical Sciences, a standard reference text in the field,
which is incorporated herein by reference. Preferred examples of such carriers or diluents
include, but are not limited to, water, saline, Ringer's solutions, dextrose solution, and 5%
human serum albumin. Liposomes and non-aqueous vehicles such as fixed oils may also be
used. The use of such media and agents for pharmaceutically active substances is well known
in the art. Except insofar as any conventional media or agent is incompatible with the active
compound, use thereof in the compositions is contemplated. Supplementary active
compounds can also be incorporated into the compositions.

A pharmaceutical composition of the invention is formulated to be compatible with its 
intended route of administration. Examples of routes of administration include parenteral,
e.g., intravenous, intradermal, subcutaneous, oral (e.g., inhalation), transdermal (i.e., topical),
transmucosal, and rectal administration. Solutions or suspensions used for parenteral,
intradermal, or subcutaneous application can include the following components: a sterile
diluent such as water for injection, saline solution, fixed oils, polyethylene glycols, glycerine,
propylene glycol or other synthetic solvents; antibacterial agents such as benzyl alcohol or
methyl parabens; antioxidants such as ascorbic acid or sodium bisulfite; chelating agents such
as ethylenediaminetetraacetic acid (EDTA); buffers such as acetates, citrates or phosphates,
and agents for the adjustment of tonicity such as sodium chloride or dextrose. The pH can be
adjusted with acids or bases, such as hydrochloric acid or sodium hydroxide. The parenteral
preparation can be enclosed in ampoules, disposable syringes or multiple dose vials made of
glass or plastic.

Pharmaceutical compositions suitable for injectable use include sterile aqueous
solutions (where water soluble) or dispersions and sterile powders for the extemporaneous
preparation of sterile injectable solutions or dispersion. For intravenous administration,
suitable carriers include physiological saline, bacteriostatic water, Cremophor EL™ (BASF,
Parsippany, N.J.) or phosphate buffered saline (PBS). In all cases, the composition must be
sterile and should be fluid to the extent that easy syringeability exists. It must be stable under
the conditions of manufacture and storage and must be preserved against the contaminating
action of microorganisms such as bacteria and fungi. The carrier can be a solvent or
dispersion medium containing, for example, water, ethanol, polyol (for example, glycerol,
propylene glycol, and liquid polyethylene glycol, and the like), and suitable mixtures thereof.
The proper fluidity can be maintained, for example, by the use of a coating such as lecithin, by
the maintenance of the required particle size in the case of dispersion and by the use of
surfactants. Prevention of the action of microorganisms can be achieved by various
antibacterial and antifungal agents, for example, parabens, chlorobutanol, phenol, ascorbic
acid, thimerosal, and the like. In many cases, it will be preferable to include isotonic agents,
for example, sugars, polyalcohols such as mannitol, sorbitol, sodium chloride in the
composition. Prolonged absorption of the injectable compositions can be brought about by
including in the composition an agent which delays absorption, for example, aluminum
monostearate and gelatin.

Sterile injectable solutions can be prepared by incorporating the active compound (e.g.,
an NOVX protein or anti-NOVX antibody) in the required amount in an appropriate solvent
with one or a combination of ingredients enumerated above, as required, followed by filtered
sterilization. Generally, dispersions are prepared by incorporating the active compound into a
sterile vehicle that contains a basic dispersion medium and the required other ingredients from
those enumerated above. In the case of sterile powders for the preparation of sterile injectable
solutions, methods of preparation are vacuum drying and freeze-drying that yield a powder of
the active ingredient plus any additional desired ingredient from a previously sterile-filtered
solution thereof.

Oral compositions generally include an inert diluent or an edible carrier. They can be 
enclosed in gelatin capsules or compressed into tablets. For the purpose of oral therapeutic 
administration, the active compound can be incorporated with excipients and used in the form 
of tablets, troches, or capsules. Oral compositions can also be prepared using a fluid carrier 
for use as a mouthwash, wherein the compound in the fluid carrier is applied orally and 
swished and expectorated or swallowed. Pharmaceutically compatible binding agents, and/or
adjuvant materials can be included as part of the composition. The tablets, pills, capsules,
troches and the like can contain any of the following ingredients, or compounds of a similar
nature: a binder such as microcrystalline cellulose, gum tragacanth or gelatin; an excipient
such as starch or lactose; a disintegrating agent such as alginic acid, Primogel, or corn starch; a
lubricant such as magnesium stearate or Sterotes; a glidant such as colloidal silicon dioxide; a 
sweetening agent such as sucrose or saccharin; or a flavoring agent such as peppermint, 
methyl salicylate, or orange flavoring. 

For administration by inhalation, the compounds are delivered in the form of an 
aerosol spray from a pressurized container or dispenser which contains a suitable propellant, e.g.,
a gas such as carbon dioxide, or a nebulizer. 

Systemic administration can also be by transmucosal or transdermal means. For 
transmucosal or transdermal administration, penetrants appropriate to the barrier to be 
permeated are used in the formulation. Such penetrants are generally known in the art, and
include, for example, for transmucosal administration, detergents, bile salts, and fusidic acid
derivatives. Transmucosal administration can be accomplished through the use of nasal sprays
or suppositories. For transdermal administration, the active compounds are formulated into 
ointments, salves, gels, or creams as generally known in the art. 

The compounds can also be prepared in the form of suppositories (e.g., with 
conventional suppository bases such as cocoa butter and other glycerides) or retention enemas 
for rectal delivery. 

In one embodiment, the active compounds are prepared with carriers that will protect 
the compound against rapid elimination from the body, such as a controlled release
formulation, including implants and microencapsulated delivery systems. Biodegradable,
biocompatible polymers can be used, such as ethylene vinyl acetate, polyanhydrides,
polyglycolic acid, collagen, polyorthoesters, and polylactic acid. Methods for preparation of
such formulations will be apparent to those skilled in the art. The materials can also be
obtained commercially from Alza Corporation and Nova Pharmaceuticals, Inc. Liposomal
suspensions (including liposomes targeted to infected cells with monoclonal antibodies to viral
antigens) can also be used as pharmaceutically acceptable carriers. These can be prepared
according to methods known to those skilled in the art, for example, as described in U.S.
Patent No. 4,522,811.

It is especially advantageous to formulate oral or parenteral compositions in dosage 
unit form for ease of administration and uniformity of dosage. Dosage unit form as used
herein refers to physically discrete units suited as unitary dosages for the subject to be treated;
each unit containing a predetermined quantity of active compound calculated to produce the
desired therapeutic effect in association with the required pharmaceutical carrier. The
specification for the dosage unit forms of the invention are dictated by and directly dependent
on the unique characteristics of the active compound and the particular therapeutic effect to be 
achieved, and the limitations inherent in the art of compounding such an active compound for 
the treatment of individuals. 

The nucleic acid molecules of the invention can be inserted into vectors and used as 
gene therapy vectors. Gene therapy vectors can be delivered to a subject by, for example,
intravenous injection, local administration (see, e.g., U.S. Patent No. 5,328,470) or by
stereotactic injection (see, e.g., Chen, et al., 1994. Proc. Natl. Acad. Sci. USA 91: 3054-3057).
The pharmaceutical preparation of the gene therapy vector can include the gene therapy vector
in an acceptable diluent, or can comprise a slow release matrix in which the gene delivery
vehicle is imbedded. Alternatively, where the complete gene delivery vector can be produced
intact from recombinant cells, e.g., retroviral vectors, the pharmaceutical preparation can
include one or more cells that produce the gene delivery system.

The pharmaceutical compositions can be included in a container, pack, or dispenser 
together with instructions for administration. 

Screening and Detection Methods 

The isolated nucleic acid molecules of the invention can be used to express NOVX 
protein (e.g., via a recombinant expression vector in a host cell in gene therapy applications), 
to detect NOVX mRNA (e.g., in a biological sample) or a genetic lesion in an NOVX gene, 
and to modulate NOVX activity, as described further, below. In addition, the NOVX proteins
can be used to screen drugs or compounds that modulate the NOVX protein activity or
expression as well as to treat disorders characterized by insufficient or excessive production of
NOVX protein or production of NOVX protein forms that have decreased or aberrant activity
compared to NOVX wild-type protein (e.g., diabetes (regulates insulin release); obesity (binds
and transports lipids); metabolic disturbances associated with obesity, the metabolic syndrome
X as well as anorexia and wasting disorders associated with chronic diseases and various
cancers, and infectious disease (possesses anti-microbial activity) and the various
dyslipidemias). In addition, the anti-NOVX antibodies of the invention can be used to detect
and isolate NOVX proteins and modulate NOVX activity. In yet a further aspect, the invention
can be used in methods to influence appetite, absorption of nutrients and the disposition of
metabolic substrates in both a positive and negative fashion.

The invention further pertains to novel agents identified by the screening assays
described herein and uses thereof for treatments as described, supra.

Screening Assays

The invention provides a method (also referred to herein as a "screening assay") for
identifying modulators, i.e., candidate or test compounds or agents (e.g., peptides,
peptidomimetics, small molecules or other drugs) that bind to NOVX proteins or have a
stimulatory or inhibitory effect on, e.g., NOVX protein expression or NOVX protein activity.
The invention also includes compounds identified in the screening assays described herein.

In one embodiment, the invention provides assays for screening candidate or test
compounds which bind to or modulate the activity of the membrane-bound form of an NOVX
protein or polypeptide or biologically-active portion thereof. The test compounds of the
invention can be obtained using any of the numerous approaches in combinatorial library
methods known in the art, including: biological libraries; spatially addressable parallel solid
phase or solution phase libraries; synthetic library methods requiring deconvolution; the
"one-bead one-compound" library method; and synthetic library methods using affinity
chromatography selection. The biological library approach is limited to peptide libraries,
while the other four approaches are applicable to peptide, non-peptide oligomer or small
molecule libraries of compounds. See, e.g., Lam, 1991. Anticancer Drug Design 12: 145.

A "small molecule" as used herein, is meant to refer to a composition that has a
molecular weight of less than about 5 kD and most preferably less than about 4 kD. Small
molecules can be, e.g., nucleic acids, peptides, polypeptides, peptidomimetics, carbohydrates,
lipids or other organic or inorganic molecules. Libraries of chemical and/or biological
mixtures, such as fungal, bacterial, or algal extracts, are known in the art and can be screened
with any of the assays of the invention.

Examples of methods for the synthesis of molecular libraries can be found in the art,
for example in: DeWitt, et al., 1993. Proc. Natl. Acad. Sci. U.S.A. 90: 6909; Erb, et al., 1994.
Proc. Natl. Acad. Sci. U.S.A. 91: 11422; Zuckermann, et al., 1994. J. Med. Chem. 37: 2678;
Cho, et al., 1993. Science 261: 1303; Carrell, et al., 1994. Angew. Chem. Int. Ed. Engl. 33:
2059; Carell, et al., 1994. Angew. Chem. Int. Ed. Engl. 33: 2061; and Gallop, et al., 1994. J.
Med. Chem. 37: 1233.

Libraries of compounds may be presented in solution (e.g., Houghten, 1992.
Biotechniques 13: 412-421), or on beads (Lam, 1991. Nature 354: 82-84), on chips (Fodor,
1993. Nature 364: 555-556), bacteria (Ladner, U.S. Patent No. 5,223,409), spores (Ladner,
U.S. Patent 5,233,409), plasmids (Cull, et al., 1992. Proc. Natl. Acad. Sci. USA 89:
1865-1869) or on phage (Scott and Smith, 1990. Science 249: 386-390; Devlin, 1990. Science
249: 404-406; Cwirla, et al., 1990. Proc. Natl. Acad. Sci. U.S.A. 87: 6378-6382; Felici, 1991.
J. Mol. Biol. 222: 301-310; Ladner, U.S. Patent No. 5,233,409).

In one embodiment, an assay is a cell-based assay in which a cell which expresses a 
membrane-bound form of NOVX protein, or a biologically-active portion thereof, on the cell
surface is contacted with a test compound and the ability of the test compound to bind to an
NOVX protein is determined. The cell, for example, can be of mammalian origin or a yeast cell.
Determining the ability of the test compound to bind to the NOVX protein can be
accomplished, for example, by coupling the test compound with a radioisotope or enzymatic
label such that binding of the test compound to the NOVX protein or biologically-active
portion thereof can be determined by detecting the labeled compound in a complex. For
example, test compounds can be labeled with 125I, 35S, or 3H, either directly or indirectly,
and the radioisotope detected by direct counting of radioemission or by scintillation counting.
Alternatively, test compounds can be enzymatically-labeled with, for example, horseradish
peroxidase, alkaline phosphatase, or luciferase, and the enzymatic label detected by
determination of conversion of an appropriate substrate to product. In one embodiment, the
assay comprises contacting a cell which expresses a membrane-bound form of NOVX protein,
or a biologically-active portion thereof on the cell surface with a known compound which
binds NOVX to form an assay mixture, contacting the assay mixture with a test compound,
and determining the ability of the test compound to interact with an NOVX protein, wherein
determining the ability of the test compound to interact with an NOVX protein comprises
determining the ability of the test compound to preferentially bind to NOVX protein or a 
biologically-active portion thereof as compared to the known compound. 



WO 01/74851 PCT/US01/10039

In another embodiment, an assay is a cell-based assay comprising contacting a cell
expressing a membrane-bound form of NOVX protein, or a biologically-active portion thereof,
on the cell surface with a test compound and determining the ability of the test compound to
modulate (e.g., stimulate or inhibit) the activity of the NOVX protein or biologically-active
portion thereof. Determining the ability of the test compound to modulate the activity of
NOVX or a biologically-active portion thereof can be accomplished, for example, by
determining the ability of the NOVX protein to bind to or interact with an NOVX target
molecule. As used herein, a "target molecule" is a molecule with which an NOVX protein
binds or interacts in nature, for example, a molecule on the surface of a cell which expresses
an NOVX interacting protein, a molecule on the surface of a second cell, a molecule in the
extracellular milieu, a molecule associated with the internal surface of a cell membrane or a
cytoplasmic molecule. An NOVX target molecule can be a non-NOVX molecule or an
NOVX protein or polypeptide of the invention. In one embodiment, an NOVX target
molecule is a component of a signal transduction pathway that facilitates transduction of an
extracellular signal (e.g., a signal generated by binding of a compound to a membrane-bound
NOVX molecule) through the cell membrane and into the cell. The target, for example, can be
a second intercellular protein that has catalytic activity or a protein that facilitates the
association of downstream signaling molecules with NOVX.

Determining the ability of the NOVX protein to bind to or interact with an NOVX
target molecule can be accomplished by one of the methods described above for determining
direct binding. In one embodiment, determining the ability of the NOVX protein to bind to or
interact with an NOVX target molecule can be accomplished by determining the activity of the
target molecule. For example, the activity of the target molecule can be determined by
detecting induction of a cellular second messenger of the target (i.e., intracellular Ca2+,
diacylglycerol, IP3, etc.), detecting catalytic/enzymatic activity of the target on an appropriate
substrate, detecting the induction of a reporter gene (comprising an NOVX-responsive
regulatory element operatively linked to a nucleic acid encoding a detectable marker, e.g.,
luciferase), or detecting a cellular response, for example, cell survival, cellular differentiation,
or cell proliferation.

In yet another embodiment, an assay of the invention is a cell-free assay comprising 
contacting an NOVX protein or biologically-active portion thereof with a test compound and
determining the ability of the test compound to bind to the NOVX protein or
biologically-active portion thereof. Binding of the test compound to the NOVX protein can be
determined either directly or indirectly as described above. In one such embodiment, the assay comprises
contacting the NOVX protein or biologically-active portion thereof with a known compound
which binds NOVX to form an assay mixture, contacting the assay mixture with a test
compound, and determining the ability of the test compound to interact with an NOVX
protein, wherein determining the ability of the test compound to interact with an NOVX
protein comprises determining the ability of the test compound to preferentially bind to NOVX
or biologically-active portion thereof as compared to the known compound.

In still another embodiment, an assay is a cell-free assay comprising contacting NOVX
protein or biologically-active portion thereof with a test compound and determining the ability
of the test compound to modulate (e.g., stimulate or inhibit) the activity of the NOVX protein
or biologically-active portion thereof. Determining the ability of the test compound to
modulate the activity of NOVX can be accomplished, for example, by determining the ability
of the NOVX protein to bind to an NOVX target molecule by one of the methods described
above for determining direct binding. In an alternative embodiment, determining the ability of
the test compound to modulate the activity of NOVX protein can be accomplished by
determining the ability of the NOVX protein to further modulate an NOVX target molecule. For
example, the catalytic/enzymatic activity of the target molecule on an appropriate substrate
can be determined as described, supra.

In yet another embodiment, the cell-free assay comprises contacting the NOVX protein
or biologically-active portion thereof with a known compound which binds NOVX protein to
form an assay mixture, contacting the assay mixture with a test compound, and determining
the ability of the test compound to interact with an NOVX protein, wherein determining the
ability of the test compound to interact with an NOVX protein comprises determining the
ability of the NOVX protein to preferentially bind to or modulate the activity of an NOVX
target molecule. 

The cell-free assays of the invention are amenable to use of both the soluble form or 
the membrane-bound form of NOVX protein. In the case of cell-free assays comprising the 
membrane-bound form of NOVX protein, it may be desirable to utilize a solubilizing agent
such that the membrane-bound form of NOVX protein is maintained in solution. Examples of
such solubilizing agents include non-ionic detergents such as n-octylglucoside,
n-dodecylglucoside, n-dodecylmaltoside, octanoyl-N-methylglucamide,
decanoyl-N-methylglucamide, Triton® X-100, Triton® X-114, Thesit®,
Isotridecypoly(ethylene glycol ether)n, N-dodecyl-N,N-dimethyl-3-ammonio-1-propane
sulfonate, 3-(3-cholamidopropyl)dimethylamminiol-1-propane sulfonate (CHAPS), or
3-(3-cholamidopropyl)dimethylamminiol-2-hydroxy-1-propane sulfonate (CHAPSO).

In more than one embodiment of the above assay methods of the invention, it may be
desirable to immobilize either NOVX protein or its target molecule to facilitate separation of
complexed from uncomplexed forms of one or both of the proteins, as well as to accommodate
automation of the assay. Binding of a test compound to NOVX protein, or interaction of
NOVX protein with a target molecule in the presence and absence of a candidate compound,
can be accomplished in any vessel suitable for containing the reactants. Examples of such
vessels include microtiter plates, test tubes, and micro-centrifuge tubes. In one embodiment, a
fusion protein can be provided that adds a domain that allows one or both of the proteins to be
bound to a matrix. For example, GST-NOVX fusion proteins or GST-target fusion proteins
can be adsorbed onto glutathione sepharose beads (Sigma Chemical, St. Louis, MO) or
glutathione derivatized microtiter plates, that are then combined with the test compound or the
test compound and either the non-adsorbed target protein or NOVX protein, and the mixture is
incubated under conditions conducive to complex formation (e.g., at physiological conditions
for salt and pH). Following incubation, the beads or microtiter plate wells are washed to
remove any unbound components, the matrix immobilized in the case of beads, complex 
determined either directly or indirectly, for example, as described, supra. Alternatively, the 
complexes can be dissociated from the matrix, and the level of NOVX protein binding or 
activity determined using standard techniques. 

Other techniques for immobilizing proteins on matrices can also be used in the 
screening assays of the invention. For example, either the NOVX protein or its target
molecule can be immobilized utilizing conjugation of biotin and streptavidin. Biotinylated
NOVX protein or target molecules can be prepared from biotin-NHS
(N-hydroxy-succinimide) using techniques well-known within the art (e.g., biotinylation kit,
Pierce Chemicals, Rockford, Ill.), and immobilized in the wells of streptavidin-coated 96 well
plates (Pierce Chemical). Alternatively, antibodies reactive with NOVX protein or target
molecules, but which do not interfere with binding of the NOVX protein to its target molecule,
can be derivatized to the wells of the plate, and unbound target or NOVX protein trapped in
the wells by antibody conjugation. Methods for detecting such complexes, in addition to those
described above for the GST-immobilized complexes, include immunodetection of complexes
using antibodies reactive with the NOVX protein or target molecule, as well as enzyme-linked 
assays that rely on detecting an enzymatic activity associated with the NOVX protein or target 
molecule. 

In another embodiment, modulators of NOVX protein expression are identified in a
method wherein a cell is contacted with a candidate compound and the expression of NOVX
mRNA or protein in the cell is determined. The level of expression of NOVX mRNA or
protein in the presence of the candidate compound is compared to the level of expression of
NOVX mRNA or protein in the absence of the candidate compound. The candidate
compound can then be identified as a modulator of NOVX mRNA or protein expression based
upon this comparison. For example, when expression of NOVX mRNA or protein is greater
(i.e., statistically significantly greater) in the presence of the candidate compound than in its
absence, the candidate compound is identified as a stimulator of NOVX mRNA or protein
expression. Alternatively, when expression of NOVX mRNA or protein is less (statistically
significantly less) in the presence of the candidate compound than in its absence, the candidate 
compound is identified as an inhibitor of NOVX mRNA or protein expression. The level of 
NOVX mRNA or protein expression in the cells can be determined by methods described 
herein for detecting NOVX mRNA or protein. 
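
The comparison described above can be illustrated with a short sketch. The text does not name a statistical test, so the following Python example assumes a two-sided permutation test on replicate expression measurements and a significance threshold of 0.05; the function names, the threshold, and the sample values are all hypothetical and are given for illustration only.

```python
import random
from statistics import mean


def permutation_p_value(treated, control, n_perm=5000, seed=0):
    """Two-sided permutation test on the difference of group means.

    Assumption: this test is one possible way to judge "statistically
    significantly greater/less"; the source text does not prescribe one.
    """
    rng = random.Random(seed)
    observed = abs(mean(treated) - mean(control))
    pooled = list(treated) + list(control)
    n_treated = len(treated)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(pooled)
        diff = abs(mean(pooled[:n_treated]) - mean(pooled[n_treated:]))
        if diff >= observed:
            hits += 1
    # Add-one correction keeps the estimate strictly above zero.
    return (hits + 1) / (n_perm + 1)


def classify_modulator(treated, control, alpha=0.05):
    """Label a candidate compound from expression levels with/without it."""
    p = permutation_p_value(treated, control)
    if p >= alpha:
        return "no significant effect"
    return "stimulator" if mean(treated) > mean(control) else "inhibitor"


if __name__ == "__main__":
    # Hypothetical replicate measurements (arbitrary units).
    with_compound = [9.8, 10.1, 10.4, 9.9, 10.2, 10.0]
    without_compound = [5.1, 4.9, 5.3, 5.0, 4.8, 5.2]
    print(classify_modulator(with_compound, without_compound))
```

Swapping the two input lists would label the same compound an inhibitor, which mirrors the two cases set out in the paragraph above.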

In yet another aspect of the invention, the NOVX proteins can be used as "bait
proteins" in a two-hybrid assay or three hybrid assay (see, e.g., U.S. Patent No. 5,283,317;
Zervos, et al., 1993. Cell 72: 223-232; Madura, et al., 1993. J. Biol. Chem. 268: 12046-12054;
Bartel, et al., 1993. Biotechniques 14: 920-924; Iwabuchi, et al., 1993. Oncogene 8:
1693-1696; and Brent WO 94/10300), to identify other proteins that bind to or interact with
NOVX ("NOVX-binding proteins" or "NOVX-bp") and modulate NOVX activity. Such 
NOVX-binding proteins are also likely to be involved in the propagation of signals by the 
NOVX proteins as, for example, upstream or downstream elements of the NOVX pathway. 

The two-hybrid system is based on the modular nature of most transcription factors, 
which consist of separable DNA-binding and activation domains. Briefly, the assay utilizes 
two different DNA constructs. In one construct, the gene that codes for NOVX is fused to a 
gene encoding the DNA binding domain of a known transcription factor (e.g., GAL-4). In the
other construct, a DNA sequence, from a library of DNA sequences, that encodes an
unidentified protein ("prey" or "sample") is fused to a gene that codes for the activation
domain of the known transcription factor. If the "bait" and the "prey" proteins are able to
interact, in vivo, forming an NOVX-dependent complex, the DNA-binding and activation
domains of the transcription factor are brought into close proximity. This proximity allows
transcription of a reporter gene (e.g., LacZ) that is operably linked to a transcriptional
regulatory site responsive to the transcription factor. Expression of the reporter gene can be
detected and cell colonies containing the functional transcription factor can be isolated and
used to obtain the cloned gene that encodes the protein which interacts with NOVX.

The invention further pertains to novel agents identified by the aforementioned
screening assays and uses thereof for treatments as described herein.



Detection Assays 

Portions or fragments of the cDNA sequences identified herein (and the corresponding
complete gene sequences) can be used in numerous ways as polynucleotide reagents. By way
of example, and not of limitation, these sequences can be used to: (i) map their respective
genes on a chromosome; and, thus, locate gene regions associated with genetic disease; (ii)
identify an individual from a minute biological sample (tissue typing); and (iii) aid in forensic
identification of a biological sample. Some of these applications are described in the
subsections, below.

Chromosome Mapping

Once the sequence (or a portion of the sequence) of a gene has been isolated, this 
sequence can be used to map the location of the gene on a chromosome. This process is called
chromosome mapping. Accordingly, portions or fragments of the NOVX sequences, SEQ ID
NOS:1, 3, 5, 7, 9, 11, 13, 15, 17, 19, 21, 23, and 25, or fragments or derivatives thereof, can be
used to map the location of the NOVX genes, respectively, on a chromosome. The mapping
of the NOVX sequences to chromosomes is an important first step in correlating these
sequences with genes associated with disease. 

Briefly, NOVX genes can be mapped to chromosomes by preparing PCR primers
(preferably 15-25 bp in length) from the NOVX sequences. Computer analysis of the NOVX
sequences can be used to rapidly select primers that do not span more than one exon in the
genomic DNA, since primers that span more than one exon would complicate the
amplification process. These primers can then be used
for PCR screening of somatic cell hybrids containing individual human chromosomes. Only
those hybrids containing the human gene corresponding to the NOVX sequences will yield an
amplified fragment. 
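
The computer-assisted primer selection described above can be sketched as follows. This is a minimal Python illustration, assuming that exon boundaries are already known as half-open coordinates on the cDNA; the function names, the coordinate convention, and the example sequence are hypothetical, and a practical primer-design tool would also screen melting temperature, GC content, and self-complementarity.

```python
def primers_within_single_exon(cdna, exon_bounds, min_len=15, max_len=25):
    """List candidate primer windows that lie entirely inside one exon.

    cdna        -- cDNA sequence as a string
    exon_bounds -- list of (start, end) half-open cDNA coordinates, one
                   pair per exon (assumed to be known in advance)
    Returns a list of (start, end, sequence) tuples.
    """
    candidates = []
    for exon_start, exon_end in exon_bounds:
        for length in range(min_len, max_len + 1):
            # The last valid start keeps the whole window inside the exon.
            for start in range(exon_start, exon_end - length + 1):
                end = start + length
                candidates.append((start, end, cdna[start:end]))
    return candidates


def spans_exon_junction(start, end, exon_bounds):
    """True if the window [start, end) is not contained in any single exon."""
    return not any(s <= start and end <= e for s, e in exon_bounds)


if __name__ == "__main__":
    # Hypothetical 60-nt cDNA made of two 30-nt exons.
    cdna = "ACGT" * 15
    exons = [(0, 30), (30, 60)]
    primers = primers_within_single_exon(cdna, exons)
    print(len(primers), "candidate windows; first:", primers[0])
```

Enumerating windows exon by exon guarantees that no candidate crosses a junction, so the separate junction check serves only as a safeguard for primers obtained by other means.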

Somatic cell hybrids are prepared by fusing somatic cells from different mammals
(e.g., human and mouse cells). As hybrids of human and mouse cells grow and divide, they
gradually lose human chromosomes in random order, but retain the mouse chromosomes. By
using media in which mouse cells cannot grow, because they lack a particular enzyme, but in
which human cells can, the one human chromosome that contains the gene encoding the
needed enzyme will be retained. By using various media, panels of hybrid cell lines can be
established. Each cell line in a panel contains either a single human chromosome or a small
number of human chromosomes, and a full set of mouse chromosomes, allowing easy
mapping of individual genes to specific human chromosomes. See, e.g., D'Eustachio, et al.,
1983. Science 220: 919-924. Somatic cell hybrids containing only fragments of human 
chromosomes can also be produced by using human chromosomes with translocations and 
deletions. 
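
The way a hybrid panel assigns a gene to a chromosome can be sketched as a concordance check: the gene is assigned to a chromosome whose presence or absence matches the PCR result in every cell line. The following Python example is illustrative only; the panel composition, the cell line names, and the function name are hypothetical.

```python
def assign_chromosome(panel, pcr_positive):
    """Return human chromosomes fully concordant with the PCR results.

    panel        -- dict mapping hybrid cell line name to the set of human
                    chromosomes that line retains
    pcr_positive -- dict mapping the same names to True if an amplified
                    fragment was obtained, otherwise False
    """
    all_chromosomes = set().union(*panel.values())
    concordant = []
    for chrom in sorted(all_chromosomes):
        # Concordant means: present in every positive line and absent
        # from every negative line.
        if all((chrom in panel[line]) == pcr_positive[line] for line in panel):
            concordant.append(chrom)
    return concordant


if __name__ == "__main__":
    # Hypothetical four-line panel.
    panel = {
        "H1": {"1", "7"},
        "H2": {"7", "X"},
        "H3": {"1", "X"},
        "H4": {"12"},
    }
    results = {"H1": True, "H2": True, "H3": False, "H4": False}
    print(assign_chromosome(panel, results))
```

If more than one chromosome is returned, the panel cannot distinguish between them and additional hybrid lines would be needed.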

PCR mapping of somatic cell hybrids is a rapid procedure for assigning a particular
sequence to a particular chromosome. Three or more sequences can be assigned per day using
a single thermal cycler. Using the NOVX sequences to design oligonucleotide primers,
sublocalization can be achieved with panels of fragments from specific chromosomes.

Fluorescence in situ hybridization (FISH) of a DNA sequence to a metaphase
chromosomal spread can further be used to provide a precise chromosomal location in one
step. Chromosome spreads can be made using cells whose division has been blocked in
metaphase by a chemical like colcemid that disrupts the mitotic spindle. The chromosomes
can be treated briefly with trypsin, and then stained with Giemsa. A pattern of light and dark
bands develops on each chromosome, so that the chromosomes can be identified individually.
The FISH technique can be used with a DNA sequence as short as 500 or 600 bases.
However, clones larger than 1,000 bases have a higher likelihood of binding to a unique
chromosomal location with sufficient signal intensity for simple detection. Preferably 1,000
bases, and more preferably 2,000 bases, will suffice to get good results in a reasonable amount
of time. For a review of this technique, see, Verma, et al., Human Chromosomes: A
Manual of Basic Techniques (Pergamon Press, New York 1988).

Reagents for chromosome mapping can be used individually to mark a single
chromosome or a single site on that chromosome, or panels of reagents can be used for
marking multiple sites and/or multiple chromosomes. Reagents corresponding to noncoding
regions of the genes actually are preferred for mapping purposes. Coding sequences are more
likely to be conserved within gene families, thus increasing the chance of cross hybridizations
during chromosomal mapping.

Once a sequence has been mapped to a precise chromosomal location, the physical
position of the sequence on the chromosome can be correlated with genetic map data. Such
data are found, e.g., in McKusick, Mendelian Inheritance in Man (available on-line
through Johns Hopkins University Welch Medical Library). The relationship between genes
and disease, mapped to the same chromosomal region, can then be identified through linkage
analysis (co-inheritance of physically adjacent genes), described in, e.g., Egeland, et al., 1987.
Nature, 325: 783-787.

Moreover, differences in the DNA sequences between individuals affected and
unaffected with a disease associated with the NOVX gene can be determined. If a mutation is
observed in some or all of the affected individuals but not in any unaffected individuals, then
the mutation is likely to be the causative agent of the particular disease. Comparison of
affected and unaffected individuals generally involves first looking for structural alterations in
the chromosomes, such as deletions or translocations that are visible from chromosome
spreads or detectable using PCR based on that DNA sequence. Ultimately, complete
sequencing of genes from several individuals can be performed to confirm the presence of a
mutation and to distinguish mutations from polymorphisms.

Tissue Typing 

The NOVX sequences of the invention can also be used to identify individuals from
minute biological samples. In this technique, an individual's genomic DNA is digested with
one or more restriction enzymes, and probed on a Southern blot to yield unique bands for
identification. The sequences of the invention are useful as additional DNA markers for RFLP
("restriction fragment length polymorphisms," described in U.S. Patent No. 5,272,057).

Furthermore, the sequences of the invention can be used to provide an alternative
technique that determines the actual base-by-base DNA sequence of selected portions of an
individual's genome. Thus, the NOVX sequences described herein can be used to prepare two
PCR primers from the 5'- and 3'-termini of the sequences. These primers can then be used to
amplify an individual's DNA and subsequently sequence it.
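
By way of non-limiting illustration, the preparation of two primers from the 5'- and 3'-termini of a sequence can be sketched as follows. The function names, the default primer length of 20 bases, and the example template are assumptions made for illustration only and are not taken from the disclosure; a practical primer design would also weigh melting temperature, GC content, and specificity.

```python
def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence (A/C/G/T only)."""
    complement = {"A": "T", "C": "G", "G": "C", "T": "A"}
    return "".join(complement[base] for base in reversed(seq.upper()))


def terminal_primers(template: str, length: int = 20) -> tuple[str, str]:
    """Derive a forward and a reverse primer from the two ends of a template.

    The forward primer copies the 5' end of the given strand. The reverse
    primer is the reverse complement of the 3' end, so that both primers
    are written in the 5'->3' direction. The length is an illustrative choice.
    """
    if len(template) < 2 * length:
        raise ValueError("template is too short for two non-overlapping primers")
    forward = template[:length].upper()
    reverse = reverse_complement(template[-length:])
    return forward, reverse


if __name__ == "__main__":
    # Hypothetical template; not a NOVX sequence.
    example = "ATGGCC" + "A" * 40 + "GGTTAA"
    print(terminal_primers(example, length=6))
```

Both primers are reported 5' to 3', which is the orientation in which oligonucleotides are conventionally written.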

Panels of corresponding DNA sequences from individuals, prepared in this manner,
can provide unique individual identifications, as each individual will have a unique set of such
DNA sequences due to allelic differences. The sequences of the invention can be used to
obtain such identification sequences from individuals and from tissue. The NOVX sequences
of the invention uniquely represent portions of the human genome. Allelic variation occurs to
some degree in the coding regions of these sequences, and to a greater degree in the noncoding
regions. It is estimated that allelic variation between individual humans occurs with a
frequency of about once per each 500 bases. Much of the allelic variation is due to single
nucleotide polymorphisms (SNPs), which include restriction fragment length polymorphisms
(RFLPs).

Each of the sequences described herein can, to some degree, be used as a standard
against which DNA from an individual can be compared for identification purposes. Because
greater numbers of polymorphisms occur in the noncoding regions, fewer sequences are
necessary to differentiate individuals. The noncoding sequences can comfortably provide
positive individual identification with a panel of perhaps 10 to 1,000 primers that each yield a
noncoding amplified sequence of 100 bases. If predicted coding sequences, such as those in
SEQ ID NOS:1, 3, 5, 7, 9, 11, 13, 15, 17, 19, 21, 23, and 25 are used, a more appropriate
number of primers for positive individual identification would be 500-2,000.
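
As a rough worked example of the figures given above, the expected number of allelic differences sampled by a primer panel can be estimated from the stated frequency of about one variation per 500 bases and an amplified sequence of 100 bases per primer. The calculation below is a simplified sketch: it applies the single genome-wide figure uniformly, whereas the text notes that noncoding regions vary more and coding regions less, and the function name is an assumption for illustration.

```python
BASES_PER_VARIATION = 500   # about one allelic variation per 500 bases (from the text)
AMPLICON_LENGTH = 100       # bases amplified per primer (from the text)


def expected_variable_sites(n_primers: int,
                            amplicon_length: int = AMPLICON_LENGTH,
                            bases_per_variation: int = BASES_PER_VARIATION) -> float:
    """Expected number of allelic differences across the whole primer panel.

    Assumes a uniform variation rate, which is a simplification.
    """
    return n_primers * amplicon_length / bases_per_variation


if __name__ == "__main__":
    for n in (10, 1000):
        print(n, "primers ->", expected_variable_sites(n), "expected variable sites")
```

Under these assumptions a panel of 10 primers samples about 2 variable sites and a panel of 1,000 primers about 200.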

Predictive Medicine 

The invention also pertains to the field of predictive medicine in which diagnostic
assays, prognostic assays, pharmacogenomics, and monitoring clinical trials are used for
prognostic (predictive) purposes to thereby treat an individual prophylactically. Accordingly,
one aspect of the invention relates to diagnostic assays for determining NOVX protein and/or
nucleic acid expression as well as NOVX activity, in the context of a biological sample (e.g.,
blood, serum, cells, tissue) to thereby determine whether an individual is afflicted with a
disease or disorder, or is at risk of developing a disorder, associated with aberrant NOVX
expression or activity. The disorders include metabolic disorders, diabetes, obesity, infectious
disease, anorexia, cancer-associated cachexia, cancer, neurodegenerative disorders,
Alzheimer's Disease, Parkinson's Disorder, immune disorders, and hematopoietic disorders,
and the various dyslipidemias, metabolic disturbances associated with obesity, the metabolic
syndrome X and wasting disorders associated with chronic diseases and various cancers. The
invention also provides for prognostic (or predictive) assays for determining whether an
individual is at risk of developing a disorder associated with NOVX protein, nucleic acid
expression or activity. For example, mutations in an NOVX gene can be assayed in a
biological sample. Such assays can be used for prognostic or predictive purpose to thereby
prophylactically treat an individual prior to the onset of a disorder characterized by or
associated with NOVX protein, nucleic acid expression, or biological activity.

Another aspect of the invention provides methods for determining NOVX protein,
nucleic acid expression or activity in an individual to thereby select appropriate therapeutic or
prophylactic agents for that individual (referred to herein as "pharmacogenomics").
Pharmacogenomics allows for the selection of agents (e.g., drugs) for therapeutic or
prophylactic treatment of an individual based on the genotype of the individual (e.g., the
genotype of the individual examined to determine the ability of the individual to respond to a
particular agent).

Yet another aspect of the invention pertains to monitoring the influence of agents (e.g.,
drugs, compounds) on the expression or activity of NOVX in clinical trials.

These and other agents are described in further detail in the following sections.

Diagnostic Assays 

An exemplary method for detecting the presence or absence of NOVX in a biological
sample involves obtaining a biological sample from a test subject and contacting the biological
sample with a compound or an agent capable of detecting NOVX protein or nucleic acid (e.g.,
mRNA, genomic DNA) that encodes NOVX protein such that the presence of NOVX is
detected in the biological sample. An agent for detecting NOVX mRNA or genomic DNA is a
labeled nucleic acid probe capable of hybridizing to NOVX mRNA or genomic DNA. The
nucleic acid probe can be, for example, a full-length NOVX nucleic acid, such as the nucleic
acid of SEQ ID NOS:1, 3, 5, 7, 9, 11, 13, 15, 17, 19, 21, 23, and 25, or a portion thereof, such
as an oligonucleotide of at least 15, 30, 50, 100, 250 or 500 nucleotides in length and sufficient
to specifically hybridize under stringent conditions to NOVX mRNA or genomic DNA. Other
suitable probes for use in the diagnostic assays of the invention are described herein.

An agent for detecting NOVX protein is an antibody capable of binding to NOVX 
protein, preferably an antibody with a detectable label. Antibodies can be polyclonal, or more 
preferably, monoclonal. An intact antibody, or a fragment thereof (e.g., Fab or F(ab')2) can be
used. The term "labeled", with regard to the probe or antibody, is intended to encompass
direct labeling of the probe or antibody by coupling (i.e., physically linking) a detectable
substance to the probe or antibody, as well as indirect labeling of the probe or antibody by
reactivity with another reagent that is directly labeled. Examples of indirect labeling include 
detection of a primary antibody using a fluorescently-labeled secondary antibody and 
end-labeling of a DNA probe with biotin such that it can be detected with fluorescently- 
labeled streptavidin. The term "biological sample" is intended to include tissues, cells and 
biological fluids isolated from a subject, as well as tissues, cells and fluids present within a 
subject. That is, the detection method of the invention can be used to detect NOVX mRNA, 
protein, or genomic DNA in a biological sample in vitro as well as in vivo. For example, in 
vitro techniques for detection of NOVX mRNA include Northern hybridizations and in situ 
hybridizations. In vitro techniques for detection of NOVX protein include enzyme linked 
immunosorbent assays (ELISAs), Western blots, immunoprecipitations, and 
immunofluorescence. In vitro techniques for detection of NOVX genomic DNA include 
Southern hybridizations. Furthermore, in vivo techniques for detection of NOVX protein 
include introducing into a subject a labeled anti-NOVX antibody. For example, the antibody
can be labeled with a radioactive marker whose presence and location in a subject can be
detected by standard imaging techniques. 

In one embodiment, the biological sample contains protein molecules from the test 
subject. Alternatively, the biological sample can contain mRNA molecules from the test 
subject or genomic DNA molecules from the test subject. A preferred biological sample is a 
peripheral blood leukocyte sample isolated by conventional means from a subject. 

In another embodiment, the methods further involve obtaining a control biological
sample from a control subject, contacting the control sample with a compound or agent
capable of detecting NOVX protein, mRNA, or genomic DNA, such that the presence of
NOVX protein, mRNA or genomic DNA is detected in the biological sample, and comparing
the presence of NOVX protein, mRNA or genomic DNA in the control sample with the
presence of NOVX protein, mRNA or genomic DNA in the test sample.

The invention also encompasses kits for detecting the presence of NOVX in a 
biological sample. For example, the kit can comprise: a labeled compound or agent capable of 
detecting NOVX protein or mRNA in a biological sample; means for determining the amount 
of NOVX in the sample; and means for comparing the amount of NOVX in the sample with a 
standard. The compound or agent can be packaged in a suitable container. The kit can further
comprise instructions for using the kit to detect NOVX protein or nucleic acid.

Prognostic Assays 

The diagnostic methods described herein can furthermore be utilized to identify
subjects having or at risk of developing a disease or disorder associated with aberrant NOVX 
expression or activity. For example, the assays described herein, such as the preceding 
diagnostic assays or the following assays, can be utilized to identify a subject having or at risk 
of developing a disorder associated with NOVX protein, nucleic acid expression or activity. 
Alternatively, the prognostic assays can be utilized to identify a subject having or at risk for 
developing a disease or disorder. Thus, the invention provides a method for identifying a 
disease or disorder associated with aberrant NOVX expression or activity in which a test
sample is obtained from a subject and NOVX protein or nucleic acid (e.g., mRNA, genomic
DNA) is detected, wherein the presence of NOVX protein or nucleic acid is diagnostic for a 
subject having or at risk of developing a disease or disorder associated with aberrant NOVX 
expression or activity. As used herein, a "test sample" refers to a biological sample obtained 
from a subject of interest. For example, a test sample can be a biological fluid (e.g., serum),
cell sample, or tissue.

Furthermore, the prognostic assays described herein can be used to determine whether
a subject can be administered an agent (e.g., an agonist, antagonist, peptidomimetic, protein,
peptide, nucleic acid, small molecule, or other drug candidate) to treat a disease or disorder
associated with aberrant NOVX expression or activity. For example, such methods can be
used to determine whether a subject can be effectively treated with an agent for a disorder.
Thus, the invention provides methods for determining whether a subject can be effectively
treated with an agent for a disorder associated with aberrant NOVX expression or activity in
which a test sample is obtained and NOVX protein or nucleic acid is detected (e.g., wherein
the presence of NOVX protein or nucleic acid is diagnostic for a subject that can be
administered the agent to treat a disorder associated with aberrant NOVX expression or
activity).

The methods of the invention can also be used to detect genetic lesions in an NOVX
gene, thereby determining if a subject with the lesioned gene is at risk for a disorder
characterized by aberrant cell proliferation and/or differentiation. In various embodiments, the
methods include detecting, in a sample of cells from the subject, the presence or absence of a
genetic lesion characterized by at least one of an alteration affecting the integrity of a gene
encoding an NOVX-protein, or the misexpression of the NOVX gene. For example, such
genetic lesions can be detected by ascertaining the existence of at least one of: (i) a deletion of
one or more nucleotides from an NOVX gene; (ii) an addition of one or more nucleotides to an
NOVX gene; (iii) a substitution of one or more nucleotides of an NOVX gene; (iv) a
chromosomal rearrangement of an NOVX gene; (v) an alteration in the level of a messenger
RNA transcript of an NOVX gene; (vi) aberrant modification of an NOVX gene, such as of the
methylation pattern of the genomic DNA; (vii) the presence of a non-wild-type splicing pattern
of a messenger RNA transcript of an NOVX gene; (viii) a non-wild-type level of an NOVX
protein; (ix) allelic loss of an NOVX gene; and (x) inappropriate post-translational
modification of an NOVX protein. As described herein, there are a large number of assay
techniques known in the art which can be used for detecting lesions in an NOVX gene. A
preferred biological sample is a peripheral blood leukocyte sample isolated by conventional
means from a subject. However, any biological sample containing nucleated cells may be
used, including, for example, buccal mucosal cells.

In certain embodiments, detection of the lesion involves the use of a probe/primer in a
polymerase chain reaction (PCR) (see, e.g., U.S. Patent Nos. 4,683,195 and 4,683,202), such
as anchor PCR or RACE PCR, or, alternatively, in a ligation chain reaction (LCR) (see, e.g.,
Landegran, et al., 1988. Science 241: 1077-1080; and Nakazawa, et al., 1994. Proc. Natl.
Acad. Sci. USA 91: 360-364), the latter of which can be particularly useful for detecting point
mutations in the NOVX-gene (see, Abravaya, et al., 1995. Nucl. Acids Res. 23: 675-682).
This method can include the steps of collecting a sample of cells from a patient, isolating
nucleic acid (e.g., genomic, mRNA or both) from the cells of the sample, contacting the
nucleic acid sample with one or more primers that specifically hybridize to an NOVX gene
under conditions such that hybridization and amplification of the NOVX gene (if present)
occurs, and detecting the presence or absence of an amplification product, or detecting the size
of the amplification product and comparing the length to a control sample. It is anticipated
that PCR and/or LCR may be desirable to use as a preliminary amplification step in
conjunction with any of the techniques used for detecting mutations described herein.

Alternative amplification methods include: self sustained sequence replication (see,
Guatelli, et al., 1990. Proc. Natl. Acad. Sci. USA 87: 1874-1878), transcriptional amplification
system (see, Kwoh, et al., 1989. Proc. Natl. Acad. Sci. USA 86: 1173-1177); Qβ Replicase
(see, Lizardi, et al., 1988. BioTechnology 6: 1197), or any other nucleic acid amplification
method, followed by the detection of the amplified molecules using techniques well known to
those of skill in the art. These detection schemes are especially useful for the detection of
nucleic acid molecules if such molecules are present in very low numbers.

In an alternative embodiment, mutations in an NOVX gene from a sample cell can be
identified by alterations in restriction enzyme cleavage patterns. For example, sample and
control DNA is isolated, amplified (optionally), digested with one or more restriction
endonucleases, and fragment length sizes are determined by gel electrophoresis and compared.
Differences in fragment length sizes between sample and control DNA indicate mutations in
the sample DNA. Moreover, sequence specific ribozymes (see, e.g., U.S. Patent
No. 5,493,531) can be used to score for the presence of specific mutations by development or
loss of a ribozyme cleavage site.
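
By way of non-limiting illustration, the comparison of restriction cleavage patterns between sample and control DNA can be sketched in silico as follows. The sketch assumes a linear molecule, considers only the given strand, and uses the well-known EcoRI site (G^AATTC) purely as an example; the function names and the toy sequences are hypothetical and do not come from the disclosure.

```python
def digest(sequence: str, site: str, cut_offset: int) -> list[int]:
    """Return fragment lengths after cutting a linear sequence at every site.

    cut_offset is the position of the cut within the recognition site,
    e.g. 1 for EcoRI, which cuts G^AATTC after the first base.
    """
    sequence, site = sequence.upper(), site.upper()
    cuts = []
    start = sequence.find(site)
    while start != -1:
        cuts.append(start + cut_offset)
        start = sequence.find(site, start + 1)
    bounds = [0] + cuts + [len(sequence)]
    return [end - begin for begin, end in zip(bounds, bounds[1:])]


def patterns_differ(control: str, sample: str, site: str, cut_offset: int) -> bool:
    """True when the sorted fragment lengths differ, as bands on a gel would."""
    return sorted(digest(control, site, cut_offset)) != sorted(digest(sample, site, cut_offset))


if __name__ == "__main__":
    control = "AAAA" + "GAATTC" + "TTTTTTTT"   # contains one EcoRI site
    sample = "AAAA" + "GAGTTC" + "TTTTTTTT"    # point change destroys the site
    print(digest(control, "GAATTC", 1), digest(sample, "GAATTC", 1))
    print(patterns_differ(control, sample, "GAATTC", 1))
```

A mutation that creates or destroys a recognition site changes the set of fragment lengths, which is the difference the gel comparison described above is intended to reveal.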

In other embodiments, genetic mutations in NOVX can be identified by hybridizing a
sample and control nucleic acids, e.g., DNA or RNA, to high-density arrays containing
hundreds or thousands of oligonucleotide probes. See, e.g., Cronin, et al., 1996. Human
Mutation 7: 244-255; Kozal, et al., 1996. Nat. Med. 2: 753-759. For example, genetic
mutations in NOVX can be identified in two dimensional arrays containing light-generated
DNA probes as described in Cronin, et al., supra. Briefly, a first hybridization array of probes
can be used to scan through long stretches of DNA in a sample and control to identify base
changes between the sequences by making linear arrays of sequential overlapping probes.
This step allows the identification of point mutations. This is followed by a second
hybridization array that allows the characterization of specific mutations by using smaller,
specialized probe arrays complementary to all variants or mutations detected. Each mutation
array is composed of parallel probe sets, one complementary to the wild-type gene and the 
other complementary to the mutant gene. 
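
The two array designs described above can be illustrated, in a simplified and non-limiting way, as a set of sequential overlapping probes tiled along a reference sequence and a paired wild-type/mutant probe set for one detected change. The probe length, step, flank size, and function names below are assumptions chosen for illustration, and the probes are written in the sense of the reference strand rather than as its complement.

```python
def tiled_probes(reference: str, probe_length: int = 25, step: int = 1) -> list[tuple[int, str]]:
    """Sequential overlapping probes covering the reference, as (start, probe)."""
    last_start = len(reference) - probe_length
    return [(i, reference[i:i + probe_length]) for i in range(0, last_start + 1, step)]


def probe_pair(reference: str, position: int, mutant_base: str, flank: int = 12) -> tuple[str, str]:
    """Wild-type and mutant probes centred on one 0-based position.

    The pair differs only at the interrogated base.
    """
    start = max(0, position - flank)
    end = min(len(reference), position + flank + 1)
    wild_type = reference[start:end]
    mutant = reference[start:position] + mutant_base + reference[position + 1:end]
    return wild_type, mutant


if __name__ == "__main__":
    ref = "ACGTACGT"  # hypothetical reference, not a NOVX sequence
    print(tiled_probes(ref, probe_length=4, step=2))
    print(probe_pair(ref, position=4, mutant_base="G", flank=2))
```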

In yet another embodiment, any of a variety of sequencing reactions known in the art
can be used to directly sequence the NOVX gene and detect mutations by comparing the
sequence of the sample NOVX with the corresponding wild-type (control) sequence.
Examples of sequencing reactions include those based on techniques developed by Maxam and
Gilbert, 1977. Proc. Natl. Acad. Sci. USA 74: 560 or Sanger, 1977. Proc. Natl. Acad. Sci. USA
74: 5463. It is also contemplated that any of a variety of automated sequencing procedures
can be utilized when performing the diagnostic assays (see, e.g., Naeve, et al., 1995.
Biotechniques 19: 448), including sequencing by mass spectrometry (see, e.g., PCT
International Publication No. WO 94/16101; Cohen, et al., 1996. Adv. Chromatography 36:
127-162; and Griffin, et al., 1993. Appl. Biochem. Biotechnol. 38: 147-159).
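
Once the sample and wild-type sequences have been obtained, the comparison step can be sketched as a position-by-position scan for substitutions. This is a minimal illustration that assumes the two sequences are already aligned and of equal length; insertions and deletions would require a proper alignment step, and the function name is hypothetical.

```python
def point_differences(wild_type: str, sample: str) -> list[tuple[int, str, str]]:
    """List substitutions as (1-based position, wild-type base, sample base).

    Assumes pre-aligned sequences of equal length.
    """
    if len(wild_type) != len(sample):
        raise ValueError("sequences must be aligned to the same length")
    return [
        (index + 1, ref_base, obs_base)
        for index, (ref_base, obs_base) in enumerate(zip(wild_type.upper(), sample.upper()))
        if ref_base != obs_base
    ]


if __name__ == "__main__":
    # Hypothetical sequences, not NOVX sequences.
    print(point_differences("ACGTACGT", "ACGTTCGT"))
```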

Other methods for detecting mutations in the NOVX gene include methods in which
protection from cleavage agents is used to detect mismatched bases in RNA/RNA or
RNA/DNA heteroduplexes. See, e.g., Myers, et al., 1985. Science 230: 1242. In general, the
art technique of "mismatch cleavage" starts by providing heteroduplexes formed by
hybridizing (labeled) RNA or DNA containing the wild-type NOVX sequence with potentially
mutant RNA or DNA obtained from a tissue sample. The double-stranded duplexes are
treated with an agent that cleaves single-stranded regions of the duplex, such as those that
exist due to basepair mismatches between the control and sample strands. For instance,
RNA/DNA duplexes can be treated with RNase and DNA/DNA hybrids treated with S1
nuclease to enzymatically digest the mismatched regions. In other embodiments, either
DNA/DNA or RNA/DNA duplexes can be treated with hydroxylamine or osmium tetroxide
and with piperidine in order to digest mismatched regions. After digestion of the mismatched
regions, the resulting material is then separated by size on denaturing polyacrylamide gels to
determine the site of mutation. See, e.g., Cotton, et al., 1988. Proc. Natl. Acad. Sci. USA 85:
4397; Saleeba, et al., 1992. Methods Enzymol. 217: 286-295. In an embodiment, the control
DNA or RNA can be labeled for detection.

In still another embodiment, the mismatch cleavage reaction employs one or more
proteins that recognize mismatched base pairs in double-stranded DNA (so called "DNA
mismatch repair" enzymes) in defined systems for detecting and mapping point mutations in
NOVX cDNAs obtained from samples of cells. For example, the mutY enzyme of E. coli
cleaves A at G/A mismatches and the thymidine DNA glycosylase from HeLa cells cleaves T
at G/T mismatches. See, e.g., Hsu, et al., 1994. Carcinogenesis 15: 1657-1662. According to
an exemplary embodiment, a probe based on an NOVX sequence, e.g., a wild-type NOVX
sequence, is hybridized to a cDNA or other DNA product from a test cell(s). The duplex is
treated with a DNA mismatch repair enzyme, and the cleavage products, if any, can be
detected from electrophoresis protocols or the like. See, e.g., U.S. Patent No. 5,459,039.

In other embodiments, alterations in electrophoretic mobility will be used to identify
mutations in NOVX genes. For example, single strand conformation polymorphism (SSCP)
may be used to detect differences in electrophoretic mobility between mutant and wild type
nucleic acids. See, e.g., Orita, et al., 1989. Proc. Natl. Acad. Sci. USA 86: 2766; Cotton,
1993. Mutat. Res. 285: 125-144; Hayashi, 1992. Genet. Anal. Tech. Appl. 9: 73-79.
Single-stranded DNA fragments of sample and control NOVX nucleic acids will be denatured
and allowed to renature. The secondary structure of single-stranded nucleic acids varies
according to sequence; the resulting alteration in electrophoretic mobility enables the detection
of even a single base change. The DNA fragments may be labeled or detected with labeled
probes. The sensitivity of the assay may be enhanced by using RNA (rather than DNA), in
which the secondary structure is more sensitive to a change in sequence. In one embodiment,
the subject method utilizes heteroduplex analysis to separate double stranded heteroduplex
molecules on the basis of changes in electrophoretic mobility. See, e.g., Keen, et al., 1991.
Trends Genet. 7: 5.

In yet another embodiment, the movement of mutant or wild-type fragments in
polyacrylamide gels containing a gradient of denaturant is assayed using denaturing gradient
gel electrophoresis (DGGE). See, e.g., Myers, et al., 1985. Nature 313: 495. When DGGE is
used as the method of analysis, DNA will be modified to ensure that it does not completely
denature, for example by adding a GC clamp of approximately 40 bp of high-melting GC-rich
DNA by PCR. In a further embodiment, a temperature gradient is used in place of a
denaturing gradient to identify differences in the mobility of control and sample DNA. See,
e.g., Rosenbaum and Reissner, 1987. Biophys. Chem. 265: 12753.

Examples of other techniques for detecting point mutations include, but are not limited 
to, selective oligonucleotide hybridization, selective amplification, or selective primer 
extension. For example, oligonucleotide primers may be prepared in which the known 
mutation is placed centrally and then hybridized to target DNA under conditions that permit 
hybridization only if a perfect match is found. See, e.g., Saiki, et al., 1986. Nature 324: 163;
Saiki, et al., 1989. Proc. Natl. Acad. Sci. USA 86: 6230. Such allele specific oligonucleotides
are hybridized to PCR amplified target DNA or a number of different mutations when the
oligonucleotides are attached to the hybridizing membrane and hybridized with labeled target
DNA. 

Alternatively, allele specific amplification technology that depends on selective PCR
amplification may be used in conjunction with the instant invention. Oligonucleotides used as
primers for specific amplification may carry the mutation of interest in the center of the
molecule (so that amplification depends on differential hybridization; see, e.g., Gibbs, et al.,
1989. Nucl. Acids Res. 17: 2437-2448) or at the extreme 3'-terminus of one primer where,
under appropriate conditions, mismatch can prevent or reduce polymerase extension (see, e.g.,
Prossner, 1993. Tibtech. 11: 238). In addition it may be desirable to introduce a novel
restriction site in the region of the mutation to create cleavage-based detection. See, e.g.,
Gasparini, et al., 1992. Mol. Cell Probes 6: 1. It is anticipated that in certain embodiments
amplification may also be performed using Taq ligase for amplification. See, e.g., Barany,
1991. Proc. Natl. Acad. Sci. USA 88: 189. In such cases, ligation will occur only if there is a
perfect match at the 3'-terminus of the 5' sequence, making it possible to detect the presence of
a known mutation at a specific site by looking for the presence or absence of amplification.

The methods described herein may be performed, for example, by utilizing
pre-packaged diagnostic kits comprising at least one probe nucleic acid or antibody reagent
described herein, which may be conveniently used, e.g., in clinical settings to diagnose
patients exhibiting symptoms or family history of a disease or illness involving an NOVX
gene.

Furthermore, any cell type or tissue, preferably peripheral blood leukocytes, in which
NOVX is expressed may be utilized in the prognostic assays described herein. However, any
biological sample containing nucleated cells may be used, including, for example, buccal
mucosal cells.

Pharmacogenomics 

Agents, or modulators that have a stimulatory or inhibitory effect on NOVX activity
(e.g., NOVX gene expression), as identified by a screening assay described herein can be
administered to individuals to treat (prophylactically or therapeutically) disorders (The
disorders include metabolic disorders, diabetes, obesity, infectious disease, anorexia, cancer-
associated cachexia, cancer, neurodegenerative disorders, Alzheimer's Disease, Parkinson's
Disorder, immune disorders, and hematopoietic disorders, and the various dyslipidemias,
metabolic disturbances associated with obesity, the metabolic syndrome X and wasting
disorders associated with chronic diseases and various cancers.) In conjunction with such
treatment, the pharmacogenomics (i.e., the study of the relationship between an individual's
genotype and that individual's response to a foreign compound or drug) of the individual may
be considered. Differences in metabolism of therapeutics can lead to severe toxicity or
therapeutic failure by altering the relation between dose and blood concentration of the
pharmacologically active drug. Thus, the pharmacogenomics of the individual permits the
selection of effective agents (e.g., drugs) for prophylactic or therapeutic treatments based on a
consideration of the individual's genotype. Such pharmacogenomics can further be used to
determine appropriate dosages and therapeutic regimens. Accordingly, the activity of NOVX
protein, expression of NOVX nucleic acid, or mutation content of NOVX genes in an
individual can be determined to thereby select appropriate agent(s) for therapeutic or
prophylactic treatment of the individual.

Pharmacogenomics deals with clinically significant hereditary variations in the
response to drugs due to altered drug disposition and abnormal action in affected persons. See
e.g., Eichelbaum, 1996. Clin. Pharmacol. Physiol., 23: 983-985; Linder, 1997. Clin.
Chem., 43: 254-266. In general, two types of pharmacogenetic conditions can be
differentiated: genetic conditions transmitted as a single factor altering the way drugs act on
the body (altered drug action), and genetic conditions transmitted as single factors altering the
way the body acts on drugs (altered drug metabolism). These pharmacogenetic conditions can
occur either as rare defects or as polymorphisms. For example, glucose-6-phosphate
dehydrogenase (G6PD) deficiency is a common inherited enzymopathy in which the main
clinical complication is hemolysis after ingestion of oxidant drugs (anti-malarials,
sulfonamides, analgesics, nitrofurans) and consumption of fava beans.

As an illustrative embodiment, the activity of drug metabolizing enzymes is a major
determinant of both the intensity and duration of drug action. The discovery of genetic
polymorphisms of drug metabolizing enzymes (e.g., N-acetyltransferase 2 (NAT 2) and
cytochrome P450 enzymes CYP2D6 and CYP2C19) has provided an explanation as to why
some patients do not obtain the expected drug effects or show exaggerated drug response and
serious toxicity after taking the standard and safe dose of a drug. These polymorphisms are
expressed in two phenotypes in the population, the extensive metabolizer (EM) and poor
metabolizer (PM). The prevalence of PM is different among different populations. For
example, the gene coding for CYP2D6 is highly polymorphic and several mutations have been
identified in PM, which all lead to the absence of functional CYP2D6. Poor metabolizers of
CYP2D6 and CYP2C19 quite frequently experience exaggerated drug response and side
effects when they receive standard doses. If a metabolite is the active therapeutic moiety, PM
show no therapeutic response, as demonstrated for the analgesic effect of codeine mediated by
its CYP2D6-formed metabolite morphine. At the other extreme are the so called ultra-rapid
metabolizers who do not respond to standard doses. Recently, the molecular basis of
ultra-rapid metabolism has been identified to be due to CYP2D6 gene amplification.

Thus, the activity of NOVX protein, expression of NOVX nucleic acid, or mutation 
content of NOVX genes in an individual can be determined to thereby select appropriate 
agent(s) for therapeutic or prophylactic treatment of the individual. In addition, 
pharmacogenetic studies can be used to apply genotyping of polymorphic alleles encoding
drug-metabolizing enzymes to the identification of an individual's drug responsiveness
phenotype. This knowledge, when applied to dosing or drug selection, can avoid adverse
reactions or therapeutic failure and thus enhance therapeutic or prophylactic efficiency when
treating a subject with an NOVX modulator, such as a modulator identified by one of the
exemplary screening assays described herein. 

Monitoring of Effects During Clinical Trials 

Monitoring the influence of agents (e.g., drugs, compounds) on the expression or
activity of NOVX (e.g., the ability to modulate aberrant cell proliferation and/or
differentiation) can be applied not only in basic drug screening, but also in clinical trials. For
example, the effectiveness of an agent determined by a screening assay as described herein to
increase NOVX gene expression, protein levels, or upregulate NOVX activity, can be
monitored in clinical trials of subjects exhibiting decreased NOVX gene expression, protein
levels, or downregulated NOVX activity. Alternatively, the effectiveness of an agent
determined by a screening assay to decrease NOVX gene expression, protein levels, or
downregulate NOVX activity, can be monitored in clinical trials of subjects exhibiting
increased NOVX gene expression, protein levels, or upregulated NOVX activity. In such
clinical trials, the expression or activity of NOVX and, preferably, other genes that have been
implicated in, for example, a cellular proliferation or immune disorder can be used as a "read
out" or markers of the immune responsiveness of a particular cell.

By way of example, and not of limitation, genes, including NOVX, that are modulated 
in cells by treatment with an agent (e.g., compound, drug or small molecule) that modulates
NOVX activity (e.g., identified in a screening assay as described herein) can be identified.
Thus, to study the effect of agents on cellular proliferation disorders, for example, in a clinical
trial, cells can be isolated and RNA prepared and analyzed for the levels of expression of
NOVX and other genes implicated in the disorder. The levels of gene expression (i.e., a gene
expression pattern) can be quantified by Northern blot analysis or RT-PCR, as described
herein, or alternatively by measuring the amount of protein produced, by one of the methods
as described herein, or by measuring the levels of activity of NOVX or other genes. In this
manner, the gene expression pattern can serve as a marker, indicative of the physiological
response of the cells to the agent. Accordingly, this response state may be determined before,
and at various points during, treatment of the individual with the agent. 
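
As a non-limiting sketch, a gene expression pattern of the kind described above could be summarized as a per-gene log2 ratio of the level measured during treatment to the level measured before treatment. The function name, the gene label "GENE_B", and the choice of a log2 ratio are assumptions made for this illustration; the specification does not prescribe a particular numerical summary.

```python
import math
from typing import Dict


def expression_pattern(baseline: Dict[str, float],
                       treated: Dict[str, float]) -> Dict[str, float]:
    """Return log2(treated / baseline) for each gene measured in both samples.

    Positive values indicate higher expression after exposure to the agent,
    negative values indicate lower expression. Genes lacking a positive
    measurement in either sample are skipped.
    """
    pattern = {}
    for gene, before in baseline.items():
        after = treated.get(gene)
        if after is None or before <= 0 or after <= 0:
            continue  # a ratio cannot be formed for this gene
        pattern[gene] = math.log2(after / before)
    return pattern


# Hypothetical measurements in arbitrary units.
before_treatment = {"NOVX": 10.0, "GENE_B": 8.0}
during_treatment = {"NOVX": 20.0, "GENE_B": 2.0}
pattern = expression_pattern(before_treatment, during_treatment)
```

Computing the same summary at each sampling point would give a series of patterns that can be compared across the course of treatment.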

In one embodiment, the invention provides a method for monitoring the effectiveness
of treatment of a subject with an agent (e.g., an agonist, antagonist, protein, peptide,
peptidomimetic, nucleic acid, small molecule, or other drug candidate identified by the
screening assays described herein) comprising the steps of (i) obtaining a pre-administration
sample from a subject prior to administration of the agent; (ii) detecting the level of expression
of an NOVX protein, mRNA, or genomic DNA in the pre-administration sample; (iii) obtaining
one or more post-administration samples from the subject; (iv) detecting the level of
expression or activity of the NOVX protein, mRNA, or genomic DNA in the
post-administration samples; (v) comparing the level of expression or activity of the NOVX
protein, mRNA, or genomic DNA in the pre-administration sample with the NOVX protein,
mRNA, or genomic DNA in the post administration sample or samples; and (vi) altering the
administration of the agent to the subject accordingly. For example, increased administration
of the agent may be desirable to increase the expression or activity of NOVX to higher levels
than detected, i.e., to increase the effectiveness of the agent. Alternatively, decreased
administration of the agent may be desirable to decrease expression or activity of NOVX to
lower levels than detected, i.e., to decrease the effectiveness of the agent.
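
The comparison of steps (i) through (vi) can be sketched, purely for illustration, as follows. The function name, the use of the mean of the post-administration levels, and the 10% tolerance are assumptions introduced for this sketch and are not part of the method as described; the output is a flag for review, not a dosing instruction.

```python
from statistics import mean
from typing import Sequence


def review_administration(pre_level: float,
                          post_levels: Sequence[float],
                          goal: str,
                          tolerance: float = 0.10) -> str:
    """Compare pre- and post-administration NOVX levels.

    goal is "increase" or "decrease", the intended direction of change.
    Returns "maintain" when the mean post-administration level has moved in
    the intended direction by at least `tolerance` (a relative change), and
    "review" otherwise. The tolerance value is an illustrative assumption.
    """
    if goal not in ("increase", "decrease"):
        raise ValueError("goal must be 'increase' or 'decrease'")
    if not post_levels:
        raise ValueError("at least one post-administration level is required")
    if pre_level <= 0:
        raise ValueError("pre_level must be positive")

    relative_change = (mean(post_levels) - pre_level) / pre_level
    if goal == "increase":
        moved_as_intended = relative_change >= tolerance
    else:
        moved_as_intended = relative_change <= -tolerance
    return "maintain" if moved_as_intended else "review"
```

For example, with a pre-administration level of 10.0 and post-administration levels of 12.0 and 14.0, an agent intended to increase NOVX expression would be flagged "maintain" under these assumptions.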

Methods of Treatment 

The invention provides for both prophylactic and therapeutic methods of treating a 
subject at risk of (or susceptible to) a disorder or having a disorder associated with aberrant 
NOVX expression or activity. The disorders include cardiomyopathy, atherosclerosis,
hypertension, congenital heart defects, aortic stenosis, atrial septal defect (ASD),
atrioventricular (A-V) canal defect, ductus arteriosus, pulmonary stenosis, subaortic stenosis,
ventricular septal defect (VSD), valve diseases, tuberous sclerosis, scleroderma, obesity,
transplantation, adrenoleukodystrophy, congenital adrenal hyperplasia, prostate cancer,
neoplasm, adenocarcinoma, lymphoma, uterus cancer, fertility, hemophilia, hypercoagulation,
idiopathic thrombocytopenic purpura, immunodeficiencies, graft versus host disease, AIDS,
bronchial asthma, Crohn's disease, multiple sclerosis, treatment of Albright Hereditary
Osteodystrophy, and other diseases, disorders and conditions of the like.
These methods of treatment will be discussed more fully, below. 

Disease and Disorders 

Diseases and disorders that are characterized by increased (relative to a subject not 
suffering from the disease or disorder) levels or biological activity may be treated with
Therapeutics that antagonize (i.e., reduce or inhibit) activity. Therapeutics that antagonize
activity may be administered in a therapeutic or prophylactic manner. Therapeutics that may
be utilized include, but are not limited to: (i) an aforementioned peptide, or analogs,
derivatives, fragments or homologs thereof; (ii) antibodies to an aforementioned peptide; (iii)
nucleic acids encoding an aforementioned peptide; (iv) administration of antisense nucleic acid
and nucleic acids that are "dysfunctional" (i.e., due to a heterologous insertion within the
coding sequences of coding sequences to an aforementioned peptide) that are utilized to
"knockout" endogenous function of an aforementioned peptide by homologous recombination
(see, e.g., Capecchi, 1989. Science 244: 1288-1292); or (v) modulators (i.e., inhibitors,
agonists and antagonists, including additional peptide mimetic of the invention or antibodies
specific to a peptide of the invention) that alter the interaction between an aforementioned 
peptide and its binding partner. 

Diseases and disorders that are characterized by decreased (relative to a subject not
suffering from the disease or disorder) levels or biological activity may be treated with
Therapeutics that increase (i.e., are agonists to) activity. Therapeutics that upregulate activity
may be administered in a therapeutic or prophylactic manner. Therapeutics that may be
utilized include, but are not limited to, an aforementioned peptide, or analogs, derivatives,
fragments or homologs thereof; or an agonist that increases bioavailability.

Increased or decreased levels can be readily detected by quantifying peptide and/or 
RNA, by obtaining a patient tissue sample (e.g., from biopsy tissue) and assaying it in vitro for
RNA or peptide levels, structure and/or activity of the expressed peptides (or mRNAs of an
aforementioned peptide). Methods that are well-known within the art include, but are not
limited to, immunoassays (e.g., by Western blot analysis, immunoprecipitation followed by
sodium dodecyl sulfate (SDS) polyacrylamide gel electrophoresis, immunocytochemistry, etc.)
and/or hybridization assays to detect expression of mRNAs (e.g., Northern assays, dot blots, in
situ hybridization, and the like).

Prophylactic Methods

In one aspect, the invention provides a method for preventing, in a subject, a disease or 
condition associated with an aberrant NOVX expression or activity, by administering to the
subject an agent that modulates NOVX expression or at least one NOVX activity. Subjects at
risk for a disease that is caused or contributed to by aberrant NOVX expression or activity can
be identified by, for example, any or a combination of diagnostic or prognostic assays as
described herein. Administration of a prophylactic agent can occur prior to the manifestation
of symptoms characteristic of the NOVX aberrancy, such that a disease or disorder is
prevented or, alternatively, delayed in its progression. Depending upon the type of NOVX
aberrancy, for example, an NOVX agonist or NOVX antagonist agent can be used for treating
the subject. The appropriate agent can be determined based on screening assays described
herein. The prophylactic methods of the invention are further discussed in the following
subsections.

Therapeutic Methods 

Another aspect of the invention pertains to methods of modulating NOVX expression 
or activity for therapeutic purposes. The modulatory method of the invention involves 
contacting a cell with an agent that modulates one or more of the activities of NOVX protein 
activity associated with the cell. An agent that modulates NOVX protein activity can be an
agent as described herein, such as a nucleic acid or a protein, a naturally-occurring cognate
ligand of an NOVX protein, a peptide, an NOVX peptidomimetic, or other small molecule. In
one embodiment, the agent stimulates one or more NOVX protein activity. Examples of such
stimulatory agents include active NOVX protein and a nucleic acid molecule encoding NOVX
that has been introduced into the cell. In another embodiment, the agent inhibits one or more
NOVX protein activity. Examples of such inhibitory agents include antisense NOVX nucleic
acid molecules and anti-NOVX antibodies. These modulatory methods can be performed in
vitro (e.g., by culturing the cell with the agent) or, alternatively, in vivo (e.g., by administering
the agent to a subject). As such, the invention provides methods of treating an individual
afflicted with a disease or disorder characterized by aberrant expression or activity of an
NOVX protein or nucleic acid molecule. In one embodiment, the method involves
administering an agent (e.g., an agent identified by a screening assay described herein), or
combination of agents that modulates (e.g., up-regulates or down-regulates) NOVX expression
or activity. In another embodiment, the method involves administering an NOVX protein or
nucleic acid molecule as therapy to compensate for reduced or aberrant NOVX expression or
activity.

Stimulation of NOVX activity is desirable in situations in which NOVX is abnormally 
downregulated and/or in which increased NOVX activity is likely to have a beneficial effect.
One example of such a situation is where a subject has a disorder characterized by aberrant
cell proliferation and/or differentiation (e.g., cancer or immune associated disorders). Another
example of such a situation is where the subject has a gestational disease (e.g., preeclampsia).

Determination of the Biological Effect of the Therapeutic

In various embodiments of the invention, suitable in vitro or in vivo assays are
performed to determine the effect of a specific Therapeutic and whether its administration is
indicated for treatment of the affected tissue.

In various specific embodiments, in vitro assays may be performed with representative
cells of the type(s) involved in the patient's disorder, to determine if a given Therapeutic exerts
the desired effect upon the cell type(s). Compounds for use in therapy may be tested in
suitable animal model systems including, but not limited to rats, mice, chicken, cows,
monkeys, rabbits, and the like, prior to testing in human subjects. Similarly, for in vivo
testing, any of the animal model systems known in the art may be used prior to administration
to human subjects. 

Prophylactic and Therapeutic Uses of the Compositions of the Invention 

The NOVX nucleic acids and proteins of the invention are useful in potential
prophylactic and therapeutic applications implicated in a variety of disorders including, but not
limited to: metabolic disorders, diabetes, obesity, infectious disease, anorexia, cancer-associated
cachexia, cancer, neurodegenerative disorders, Alzheimer's Disease, Parkinson's Disorder,
immune disorders, hematopoietic disorders, and the various dyslipidemias, metabolic
disturbances associated with obesity, the metabolic syndrome X and wasting disorders
associated with chronic diseases and various cancers.

As an example, a cDNA encoding the NOVX protein of the invention may be useful in 
gene therapy, and the protein may be useful when administered to a subject in need thereof.
By way of non-limiting example, the compositions of the invention will have efficacy for
treatment of patients suffering from: metabolic disorders, diabetes, obesity, infectious disease,
anorexia, cancer-associated cachexia, cancer, neurodegenerative disorders, Alzheimer's
Disease, Parkinson's Disorder, immune disorders, hematopoietic disorders, and the various
dyslipidemias.

Both the novel nucleic acid encoding the NOVX protein, and the NOVX protein of the
invention, or fragments thereof, may also be useful in diagnostic applications, wherein the
presence or amount of the nucleic acid or the protein are to be assessed. A further use could
be as an anti-bacterial molecule (i.e., some peptides have been found to possess anti-bacterial
properties). These materials are further useful in the generation of antibodies which
immunospecifically-bind to the novel substances of the invention for use in therapeutic or
diagnostic methods.



EQUIVALENTS

Although particular embodiments have been disclosed herein in detail, this has been
done by way of example for purposes of illustration only, and is not intended to be limiting
with respect to the scope of the appended claims, which follow. In particular, it is
contemplated by the inventors that various substitutions, alterations, and modifications may be
made to the invention without departing from the spirit and scope of the invention as defined
by the claims. The choice of nucleic acid starting material, clone of interest, or library type is
believed to be a matter of routine for a person of ordinary skill in the art with knowledge of the
embodiments described herein. Other aspects, advantages, and modifications are considered to be
within the scope of the following claims.

WHAT IS CLAIMED IS:

1. An isolated polypeptide comprising an amino acid sequence selected from the group
consisting of:

(a) a mature form of an amino acid sequence selected from the group consisting of
SEQ ID NOS:2, 4, 6, 8, 10, 12, 14, 16, 18, 20, 22, 24, and 26;

(b) a variant of a mature form of an amino acid sequence selected from the group
consisting of SEQ ID NOS:2, 4, 6, 8, 10, 12, 14, 16, 18, 20, 22, 24, and 26,
wherein one or more amino acid residues in said variant differs from the amino
acid sequence of said mature form, provided that said variant differs in no more
than 15% of the amino acid residues from the amino acid sequence of said
mature form;

(c) an amino acid sequence selected from the group consisting of SEQ ID NOS:2, 4,
6, 8, 10, 12, 14, 16, 18, 20, 22, 24, and 26; and

(d) a variant of an amino acid sequence selected from the group consisting of SEQ
ID NOS:2, 4, 6, 8, 10, 12, 14, 16, 18, 20, 22, 24, and 26, wherein one or more
amino acid residues in said variant differs from the amino acid sequence of said
mature form, provided that said variant differs in no more than 15% of amino
acid residues from said amino acid sequence.

2. The polypeptide of claim 1, wherein said polypeptide comprises the amino acid
sequence of a naturally-occurring allelic variant of an amino acid sequence selected
from the group consisting of SEQ ID NOS:2, 4, 6, 8, 10, 12, 14, 16, 18, 20, 22, 24, and
26.

3. The polypeptide of claim 2, wherein said allelic variant comprises an amino acid
sequence that is the translation of a nucleic acid sequence differing by a single
nucleotide from a nucleic acid sequence selected from the group consisting of SEQ ID
NOS:1, 3, 5, 7, 9, 11, 13, 15, 17, 19, 21, 23, and 25.

4. The polypeptide of claim 1, wherein the amino acid sequence of said variant comprises
a conservative amino acid substitution.

5. An isolated nucleic acid molecule comprising a nucleic acid sequence encoding a
polypeptide comprising an amino acid sequence selected from the group consisting of:

(a) a mature form of an amino acid sequence selected from the group consisting of
SEQ ID NOS:2, 4, 6, 8, 10, 12, 14, 16, 18, 20, 22, 24, and 26;

(b) a variant of a mature form of an amino acid sequence selected from the group
consisting of SEQ ID NOS:2, 4, 6, 8, 10, 12, 17, 19, 21, 23, 25, 29, 31, 33, 35,
37, 83, and 85, wherein one or more amino acid residues in said variant differs
from the amino acid sequence of said mature form, provided that said variant
differs in no more than 15% of the amino acid residues from the amino acid
sequence of said mature form;

(c) an amino acid sequence selected from the group consisting of SEQ ID NOS:2,
4, 6, 8, 10, 12, 14, 16, 18, 20, 22, 24, and 26;

(d) a variant of an amino acid sequence selected from the group consisting of SEQ ID
NOS:2, 4, 6, 8, 10, 12, 14, 16, 18, 20, 22, 24, and 26, wherein one or more
amino acid residues in said variant differs from the amino acid sequence of said
mature form, provided that said variant differs in no more than 15% of amino
acid residues from said amino acid sequence;

(e) a nucleic acid fragment encoding at least a portion of a polypeptide comprising
an amino acid sequence chosen from the group consisting of SEQ ID NOS:2, 4,
6, 8, 10, 12, 14, 16, 18, 20, 22, 24, and 26, or a variant of said polypeptide,
wherein one or more amino acid residues in said variant differs from the amino
acid sequence of said mature form, provided that said variant differs in no more
than 15% of amino acid residues from said amino acid sequence; and

(f) a nucleic acid molecule comprising the complement of (a), (b), (c), (d) or (e). 

6. The nucleic acid molecule of claim 5, wherein the nucleic acid molecule comprises the
nucleotide sequence of a naturally-occurring allelic nucleic acid variant.

7. The nucleic acid molecule of claim 5, wherein the nucleic acid molecule encodes a
polypeptide comprising the amino acid sequence of a naturally-occurring polypeptide
variant.

8. The nucleic acid molecule of claim 5, wherein the nucleic acid molecule differs by a
single nucleotide from a nucleic acid sequence selected from the group consisting of
SEQ ID NOS:1, 3, 5, 7, 9, 11, 13, 15, 17, 19, 21, 23, and 25.

9. The nucleic acid molecule of claim 5, wherein said nucleic acid molecule comprises a
nucleotide sequence selected from the group consisting of:

(a) a nucleotide sequence selected from the group consisting of SEQ ID NOS:1, 3,
5, 7, 9, 11, 13, 15, 17, 19, 21, 23, and 25;

(b) a nucleotide sequence differing by one or more nucleotides from a nucleotide
sequence selected from the group consisting of SEQ ID NOS:1, 3, 5, 7, 9, 11,
13, 15, 17, 19, 21, 23, and 25, provided that no more than 20% of the
nucleotides differ from said nucleotide sequence;

(c) a nucleic acid fragment of (a); and

(d) a nucleic acid fragment of (b).

10. The nucleic acid molecule of claim 5, wherein said nucleic acid molecule hybridizes
under stringent conditions to a nucleotide sequence chosen from the group consisting of
SEQ ID NOS:1, 3, 5, 7, 9, 11, 13, 15, 17, 19, 21, 23, and 25, or a complement of said
nucleotide sequence.

11. The nucleic acid molecule of claim 5, wherein the nucleic acid molecule comprises a
nucleotide sequence selected from the group consisting of:

(a) a first nucleotide sequence comprising a coding sequence differing by one or
more nucleotide sequences from a coding sequence encoding said amino acid
sequence, provided that no more than 20% of the nucleotides in the coding
sequence in said first nucleotide sequence differ from said coding sequence;

(b) an isolated second polynucleotide that is a complement of the first
polynucleotide; and

(c) a nucleic acid fragment of (a) or (b).

12. A vector comprising the nucleic acid molecule of claim 11.

13. The vector of claim 12, further comprising a promoter operably-linked to said nucleic
acid molecule.

14. A cell comprising the vector of claim 12.



15. An antibody that binds immunospecifically to the polypeptide of claim 1.

16. The antibody of claim 15, wherein said antibody is a monoclonal antibody.

17. The antibody of claim 15, wherein the antibody is a humanized antibody.

18. A method for determining the presence or amount of the polypeptide of claim 1 in a
sample, the method comprising:

(a) providing the sample;

(b) contacting the sample with an antibody that binds immunospecifically to the
polypeptide; and 

(c) determining the presence or amount of antibody bound to said polypeptide, 
thereby determining the presence or amount of polypeptide in said sample. 

19. A method for determining the presence or amount of the nucleic acid molecule of 
claim 5 in a sample, the method comprising: 

(a) providing the sample; 

(b) contacting the sample with a probe that binds to said nucleic acid molecule; and 

(c) determining the presence or amount of the probe bound to said nucleic acid 
molecule, 

thereby determining the presence or amount of the nucleic acid molecule in said sample. 

20. The method of claim 19 wherein presence or amount of the nucleic acid molecule is
used as a marker for cell or tissue type.

21. The method of claim 20 wherein the cell or tissue type is cancerous.

22. A method of identifying an agent that binds to a polypeptide of claim 1, the method
comprising:

(a) contacting said polypeptide with said agent; and

(b) determining whether said agent binds to said polypeptide.

23. The method of claim 22 wherein the agent is a cellular receptor or a downstream
effector.

24. A method for identifying an agent that modulates the expression or activity of the
polypeptide of claim 1, the method comprising:

(a) providing a cell expressing said polypeptide;

(b) contacting the cell with said agent; and

(c) determining whether the agent modulates expression or activity of said
polypeptide,

whereby an alteration in expression or activity of said peptide indicates said agent modulates
expression or activity of said polypeptide.

25. A method for modulating the activity of the polypeptide of claim 1, the method
comprising contacting a cell sample expressing the polypeptide of said claim with a
compound that binds to said polypeptide in an amount sufficient to modulate the
activity of the polypeptide.

26. A method of treating or preventing a NOVX-associated disorder, said method 
comprising administering to a subject in which such treatment or prevention is desired 
the polypeptide of claim 1 in an amount sufficient to treat or prevent said NOVX-
associated disorder in said subject.

27. The method of claim 26 wherein the disorder is selected from the group consisting of
cardiomyopathy and atherosclerosis.

28. The method of claim 26 wherein the disorder is related to cell signal processing and
metabolic pathway modulation.

29. The method of claim 26, wherein said subject is a human. 

30. A method of treating or preventing a NOVX-associated disorder, said method
comprising administering to a subject in which such treatment or prevention is desired
the nucleic acid of claim 5 in an amount sufficient to treat or prevent said NOVX-
associated disorder in said subject.

31. The method of claim 30 wherein the disorder is selected from the group consisting of
cardiomyopathy and atherosclerosis.

32. The method of claim 30 wherein the disorder is related to cell signal processing and
metabolic pathway modulation.

33. The method of claim 30, wherein said subject is a human.

34. A method of treating or preventing a NOVX-associated disorder, said method 
comprising administering to a subject in which such treatment or prevention is desired
the antibody of claim 15 in an amount sufficient to treat or prevent said NOVX-
associated disorder in said subject.

35. The method of claim 34 wherein the disorder is diabetes.

36. The method of claim 34 wherein the disorder is related to cell signal processing and
metabolic pathway modulation.

37. The method of claim 34, wherein the subject is a human.

38. A pharmaceutical composition comprising the polypeptide of claim 1 and a
pharmaceutically-acceptable carrier.

39. A pharmaceutical composition comprising the nucleic acid molecule of claim 5 and a
pharmaceutically-acceptable carrier.

40. A pharmaceutical composition comprising the antibody of claim 15 and a
pharmaceutically-acceptable carrier.

41. A kit comprising in one or more containers, the pharmaceutical composition of claim
38.

42. A kit comprising in one or more containers, the pharmaceutical composition of claim
39.

43. A kit comprising in one or more containers, the pharmaceutical composition of claim
40.

44. A method for determining the presence of or predisposition to a disease associated with 
altered levels of the polypeptide of claim 1 in a first mammalian subject, the method
comprising:

(a) measuring the level of expression of the polypeptide in a sample from the first
mammalian subject; and

(b) comparing the amount of said polypeptide in the sample of step (a) to the
amount of the polypeptide present in a control sample from a second
mammalian subject known not to have, or not to be predisposed to, said
disease;

wherein an alteration in the expression level of the polypeptide in the first subject as compared
to the control sample indicates the presence of or predisposition to said disease.

45. The method of claim 44 wherein the predisposition is to cancers.

46. A method for determining the presence of or predisposition to a disease associated with 
altered levels of the nucleic acid molecule of claim 5 in a first mammalian subject, the 
method comprising: 

(a) measuring the amount of the nucleic acid in a sample from the first mammalian
subject; and

(b) comparing the amount of said nucleic acid in the sample of step (a) to the
amount of the nucleic acid present in a control sample from a second
mammalian subject known not to have, or not to be predisposed to, the disease;

wherein an alteration in the level of the nucleic acid in the first subject as compared to the 
control sample indicates the presence of or predisposition to the disease. 

47. The method of claim 46 wherein the predisposition is to a cancer.

48. A method of treating a pathological state in a mammal, the method comprising
administering to the mammal a polypeptide in an amount that is sufficient to alleviate
the pathological state, wherein the polypeptide is a polypeptide having an amino acid
sequence at least 95% identical to a polypeptide comprising an amino acid sequence of
at least one of SEQ ID NOS:2, 4, 6, 8, 10, 12, 14, 16, 18, 20, 22, 24, and 26, or a
biologically active fragment thereof.

49. A method of treating a pathological state in a mammal, the method comprising
administering to the mammal the antibody of claim 15 in an amount sufficient to
alleviate the pathological state.
